    • 12. Granted invention patent
    • 3-D clipping in a graphics processing unit
    • US08773459B2
    • 2014-07-08
    • US13524946
    • 2012-06-15
    • Guofang Jiao, Chun Yu, Lingjun Chen, Yun Du
    • Guofang Jiao, Chun Yu, Lingjun Chen, Yun Du
    • G09G5/00; G06T1/20; G06T15/30; G06T19/00; G06T11/40; G06T15/00; G06T11/60
    • G06T1/20; G06T11/40; G06T11/60; G06T15/005; G06T15/30; G06T19/00; G09G5/393
    • A graphics processing unit (GPU) efficiently performs 3-dimensional (3-D) clipping using processing units used for other graphics functions. The GPU includes first and second hardware units and at least one buffer. The first hardware unit performs 3-D clipping of primitives using a first processing unit used for a first graphics function, e.g., an ALU used for triangle setup, depth gradient setup, etc. The first hardware unit may perform 3-D clipping by (a) computing clip codes for each vertex of each primitive, (b) determining whether to pass, discard or clip each primitive based on the clip codes for all vertices of the primitive, and (c) clipping each primitive to be clipped against clipping planes. The second hardware unit computes attribute component values for new vertices resulting from the 3-D clipping, e.g., using an ALU used for attribute gradient setup, attribute interpolation, etc. The buffer(s) store intermediate results of the 3-D clipping.
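Steps (a) and (b) of the abstract, computing per-vertex clip codes and then deciding whether to pass, discard or clip a primitive, can be illustrated with a minimal Python sketch. It assumes OpenGL-style homogeneous clip space (-w <= x, y, z <= w) and a six-bit code layout; the abstract specifies neither, and the names `clip_code` and `classify` are hypothetical.

```python
# Hypothetical sketch: clip codes and the pass/discard/clip decision.
# Assumption: OpenGL-style clip space, one bit per clipping plane.
LEFT, RIGHT, BOTTOM, TOP, NEAR, FAR = (1 << i for i in range(6))


def clip_code(vertex):
    """Return a bit mask with one bit set per plane the vertex lies outside."""
    x, y, z, w = vertex
    code = 0
    if x < -w:
        code |= LEFT
    if x > w:
        code |= RIGHT
    if y < -w:
        code |= BOTTOM
    if y > w:
        code |= TOP
    if z < -w:
        code |= NEAR
    if z > w:
        code |= FAR
    return code


def classify(primitive):
    """Decide the fate of a primitive from the clip codes of all its vertices."""
    codes = [clip_code(v) for v in primitive]
    or_all = 0
    and_all = codes[0]
    for c in codes:
        or_all |= c
        and_all &= c
    if or_all == 0:
        return "pass"      # every vertex is inside every plane
    if and_all != 0:
        return "discard"   # all vertices are outside the same plane
    return "clip"          # the primitive may cross a plane and needs step (c)
```

Only primitives classified as "clip" would go on to step (c), the actual clipping against the planes, which this sketch does not cover.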
    • 13. Granted invention patent
    • Convolution filtering in a graphics processor
    • US08644643B2
    • 2014-02-04
    • US11453436
    • 2006-06-14
    • Guofang Jiao, Yun Du, Chun Yu, Lingjun Chen
    • Guofang Jiao, Yun Du, Chun Yu, Lingjun Chen
    • G06K9/40
    • G06F17/153; G06T5/20; G06T15/04
    • Techniques for performing convolution filtering using hardware normally available in a graphics processor are described. Convolution filtering of an arbitrary H×W grid of pixels is achieved by partitioning the grid into smaller sections, performing computation for each section, and combining the intermediate results for all sections to obtain a final result. In one design, a command to perform convolution filtering on a grid of pixels with a kernel of coefficients is received, e.g., from a graphics application. The grid is partitioned into multiple sections, where each section may be 2×2 or smaller. Multiple instructions are generated for the multiple sections, with each instruction performing convolution computation on at least one pixel in one section. Each instruction may include pixel position information and applicable kernel coefficients. Instructions to combine the intermediate results from the multiple instructions are also generated.
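The partition-and-combine idea in the abstract can be sketched in Python as follows. This is only an illustration under stated assumptions: the grid and kernel are plain lists of lists, the kernel is already oriented so that each coefficient lines up with its pixel, and the function name `convolve_sectioned` is hypothetical. Each inner pass stands in for one "instruction" that handles a section of at most 2×2 pixels, and the final sum stands in for the combine instructions.

```python
def convolve_sectioned(grid, kernel, section=2):
    """Weighted sum of an H x W pixel grid, computed one small section at a time.

    Returns (final_result, partial_results), where each partial result is the
    contribution of one section of at most `section` x `section` pixels.
    """
    height, width = len(grid), len(grid[0])
    partials = []
    for r0 in range(0, height, section):
        for c0 in range(0, width, section):
            acc = 0.0
            # One "instruction": multiply-accumulate over a single section.
            for r in range(r0, min(r0 + section, height)):
                for c in range(c0, min(c0 + section, width)):
                    acc += grid[r][c] * kernel[r][c]
            partials.append(acc)
    # Combine the intermediate results from all sections.
    return sum(partials), partials
```

For a 3×3 grid this produces four sections (2×2, 2×1, 1×2 and 1×1), matching the abstract's "2×2 or smaller".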
    • 14. Invention patent application
    • GRAPHICS PROCESSING UNIT WITH DEFERRED VERTEX SHADING
    • US20100302246A1
    • 2010-12-02
    • US12557427
    • 2009-09-10
    • Guofang Jiao, Yun Du, Lingjun Chen, Chun Yu
    • Guofang Jiao, Yun Du, Lingjun Chen, Chun Yu
    • G06T1/20; G06T15/60
    • G06T15/40; G06T1/20; G06T15/005
    • Techniques are described for processing graphics images with a graphics processing unit (GPU) using deferred vertex shading. An example method includes the following: generating, within a processing pipeline of a graphics processing unit (GPU), vertex coordinates for vertices of each primitive within an image geometry, wherein the vertex coordinates comprise a location and a perspective parameter for each one of the vertices, and wherein the image geometry represents a graphics image; identifying, within the processing pipeline of the GPU, visible primitives within the image geometry based upon the vertex coordinates; and, responsive to identifying the visible primitives, generating, within the processing pipeline of the GPU, vertex attributes only for the vertices of the visible primitives in order to determine surface properties of the graphics image.
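The ordering described in the abstract (coordinates first, visibility next, attributes only for what survives) can be sketched in Python. All names here are hypothetical, and the shaders and the visibility test are passed in as plain callables because the abstract does not say how visibility is determined.

```python
def deferred_vertex_shading(vertices, primitives, position_shader,
                            attribute_shader, is_visible):
    """Shade attributes only for vertices that belong to visible primitives.

    vertices:   list of raw vertex records
    primitives: list of tuples of indices into `vertices`
    Returns (visible_primitives, attributes_by_vertex_index).
    """
    # Pass 1: generate coordinates for every vertex (the cheap part).
    positions = [position_shader(v) for v in vertices]

    # Identify visible primitives from the coordinates alone.
    visible = [p for p in primitives
               if is_visible([positions[i] for i in p])]

    # Pass 2: generate attributes only for vertices still referenced.
    needed = sorted({i for p in visible for i in p})
    attributes = {i: attribute_shader(vertices[i]) for i in needed}
    return visible, attributes
```

The saving comes from the second pass: vertices that belong only to invisible primitives never reach `attribute_shader`.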
    • 15. Invention patent application
    • Processing of Command Sub-Lists by Multiple Graphics Processing Units
    • US20080055326A1
    • 2008-03-06
    • US11469932
    • 2006-09-05
    • Yun Du, Chun Yu, Guofang Jiao, Lingjun Chen
    • Yun Du, Chun Yu, Guofang Jiao, Lingjun Chen
    • G09G5/36
    • G06T1/60; G09G5/36
    • Techniques to allow multiple graphics processing units to operate in parallel, even with limited storage space, are described. An apparatus includes first and second processing units and a memory. The first processing unit performs pre-processing on a batch of graphics application data for an image (e.g., for vertices in the image) and generates command sub-lists for the batch. The second processing unit performs post-processing on the command sub-lists (e.g., for pixels of the image) and generates output data for the image. The first and second processing units may operate in parallel on different command sub-lists. The memory stores the command sub-lists and may also store a header for each command sub-list, a look-up table of memory addresses for the command sub-lists, a write counter indicating the most recently generated command sub-list, and a read counter indicating the most recently post-processed command sub-list.
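How a write counter and a read counter let two units share a small memory can be sketched as a fixed-size ring of sub-list slots. This is a single-threaded Python illustration, not the patented design: the class name `SubListBuffer`, the slot layout, and the choice to store counts of sub-lists written and read (rather than indices of the most recent ones) are all assumptions.

```python
class SubListBuffer:
    """Fixed-size store for command sub-lists shared by two processing units."""

    def __init__(self, slots):
        self.slots = [None] * slots  # limited storage space
        self.write_count = 0         # sub-lists generated so far (pre-processing)
        self.read_count = 0          # sub-lists consumed so far (post-processing)

    def write(self, header, commands):
        """Store a new sub-list; return False if every slot is still unread."""
        if self.write_count - self.read_count >= len(self.slots):
            return False             # the first unit must wait for the second
        self.slots[self.write_count % len(self.slots)] = (header, commands)
        self.write_count += 1
        return True

    def read(self):
        """Return the oldest unread sub-list, or None if nothing is pending."""
        if self.read_count == self.write_count:
            return None              # the second unit must wait for the first
        item = self.slots[self.read_count % len(self.slots)]
        self.read_count += 1
        return item
```

Comparing the two counters is enough for each side to know whether it may proceed, so the units can work on different sub-lists at the same time without overwriting unread data.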
    • 16. Invention patent application
    • Convolution filtering in a graphics processor
    • US20070292047A1
    • 2007-12-20
    • US11453436
    • 2006-06-14
    • Guofang Jiao, Yun Du, Chun Yu, Lingjun Chen
    • Guofang Jiao, Yun Du, Chun Yu, Lingjun Chen
    • G06K9/64
    • G06F17/153; G06T5/20; G06T15/04
    • Techniques for performing convolution filtering using hardware normally available in a graphics processor are described. Convolution filtering of an arbitrary H×W grid of pixels is achieved by partitioning the grid into smaller sections, performing computation for each section, and combining the intermediate results for all sections to obtain a final result. In one design, a command to perform convolution filtering on a grid of pixels with a kernel of coefficients is received, e.g., from a graphics application. The grid is partitioned into multiple sections, where each section may be 2×2 or smaller. Multiple instructions are generated for the multiple sections, with each instruction performing convolution computation on at least one pixel in one section. Each instruction may include pixel position information and applicable kernel coefficients. Instructions to combine the intermediate results from the multiple instructions are also generated.
    • 17. Granted invention patent
    • Multi-threaded processor with deferred thread output control
    • US08869147B2
    • 2014-10-21
    • US11445100
    • 2006-05-31
    • Yun Du, Guofang Jiao, Chun Yu
    • Yun Du, Guofang Jiao, Chun Yu
    • G06F9/46; G06F9/48; G06F9/30; G06F9/38
    • G06F9/4881; G06F9/30123; G06F9/3836; G06F9/3851; G06F9/3855; G06F9/3857; Y02D10/24
    • A multi-threaded processor is provided that internally reorders output threads thereby avoiding the need for an external output reorder buffer. The multi-threaded processor writes its thread results back to an internal memory buffer to guarantee that thread results are outputted in the same order in which the threads are received. A thread scheduler within the multi-threaded processor manages thread ordering control to avoid the need for an external reorder buffer. A compiler for the multi-threaded processor converts instructions that would normally send processed results directly to an external reorder buffer so that the processed thread results are instead sent to the internal memory buffer of the multi-threaded processor.
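The ordering guarantee in the abstract (results leave in the order the threads arrived, even if they finish out of order) can be shown with a small Python sketch. It models only the guarantee, not the hardware: the class `InOrderOutput`, the use of arrival sequence numbers, and the dictionary standing in for the internal memory buffer are assumptions made for illustration.

```python
class InOrderOutput:
    """Hold finished thread results internally and release them in arrival order."""

    def __init__(self):
        self.next_seq = 0   # arrival number of the next thread allowed to output
        self.pending = {}   # finished results still waiting for earlier threads

    def complete(self, seq, result):
        """Record that thread `seq` finished; return results now safe to output."""
        self.pending[seq] = result
        released = []
        # Drain every result whose predecessors have all been output.
        while self.next_seq in self.pending:
            released.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return released
```

A thread that finishes early simply waits in the internal buffer, which is why no reorder buffer is needed downstream.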
    • 18. Granted invention patent
    • Unified virtual addressed register file
    • US08766996B2
    • 2014-07-01
    • US11472701
    • 2006-06-21
    • Yun Du, Guofang Jiao, Chun Yu, De Dzwo Hsu
    • Yun Du, Guofang Jiao, Chun Yu, De Dzwo Hsu
    • G09G5/36
    • G06F9/3851; G06F9/3012; G06F9/30123; G06F9/30138; G06F9/384; G06T15/005
    • A multi-threaded processor is provided, such as a shader processor, having an internal unified memory space that is shared by a plurality of threads and is dynamically assigned to threads as needed. A mapping table maps virtual registers to available internal addresses in the unified memory space so that thread registers can be stored in contiguous or non-contiguous memory addresses. Dynamic sizing of the virtual registers allows flexible allocation of the unified memory space depending on the type and size of data in a thread register. Yet another feature provides an efficient method for storing graphics data in the unified memory space to improve fetch and store operations from the memory space. In particular, pixel data for four pixels in a thread are stored across four memory devices having independent input/output ports that permit the four pixels to be read in a single clock cycle for processing.
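The mapping-table idea can be sketched in Python: each thread refers to its own virtual registers, and a table translates them to whichever internal addresses happen to be free. The class `UnifiedRegisterFile`, the first-free allocation policy, and the one-slot-per-register granularity are assumptions for illustration; the abstract's dynamic sizing and four-bank pixel layout are not modelled.

```python
class UnifiedRegisterFile:
    """Shared memory space with a per-thread virtual-to-internal mapping table."""

    def __init__(self, size):
        self.memory = [0] * size
        self.free = list(range(size))  # available internal addresses
        self.table = {}                # (thread, virtual register) -> address

    def allocate(self, thread, vreg):
        """Map a virtual register to the next free internal address."""
        if not self.free:
            raise MemoryError("unified memory space is full")
        addr = self.free.pop(0)
        self.table[(thread, vreg)] = addr
        return addr

    def write(self, thread, vreg, value):
        self.memory[self.table[(thread, vreg)]] = value

    def read(self, thread, vreg):
        return self.memory[self.table[(thread, vreg)]]

    def release_thread(self, thread):
        """Return all of a finished thread's addresses to the free pool."""
        for key in [k for k in self.table if k[0] == thread]:
            self.free.append(self.table.pop(key))
```

Because lookups always go through the table, a thread's registers do not need to sit at contiguous addresses, and space freed by one thread can be reused by another.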
    • 19. Granted invention patent
    • Graphics processors with parallel scheduling and execution of threads
    • US08345053B2
    • 2013-01-01
    • US11533880
    • 2006-09-21
    • Guofang Jiao, Yun Du, Chun Yu
    • Guofang Jiao, Yun Du, Chun Yu
    • G06F15/80; G06F15/00; G06T1/00
    • G06T15/005
    • A graphics processor capable of parallel scheduling and execution of multiple threads, and techniques for achieving parallel scheduling and execution, are described. The graphics processor may include multiple hardware units and a scheduler. The hardware units are operable in parallel, with each hardware unit supporting a respective set of operations. The hardware units may include an ALU core, an elementary function core, a logic core, a texture sampler, a load control unit, some other hardware unit, or a combination thereof. The scheduler dispatches instructions for multiple threads to the hardware units concurrently. The graphics processor may further include an instruction cache to store instructions for threads and register banks to store data. The instruction cache and register banks may be shared by the hardware units.
    • 20. Granted invention patent
    • On-demand multi-thread multimedia processor
    • US07685409B2
    • 2010-03-23
    • US11677362
    • 2007-02-21
    • Yun Du, Guofang Jiao, Chun Yu
    • Yun Du, Guofang Jiao, Chun Yu
    • G06F9/00
    • G06F12/0842; G06F9/30145; G06F9/30167; G06F9/382; G06F9/383; G06F9/3851; G06F9/3885; G06F9/45558; G06F9/5016; G06F12/10; G06F2009/45579; G06F2009/45583; Y02D10/13; Y02D10/22
    • A device includes a multimedia processor that can concurrently support multiple applications for various types of multimedia such as graphics, audio, video, camera, games, etc. The multimedia processor includes configurable storage resources to store instructions, data, and state information for the applications and assignable processing units to perform various types of processing for the applications. The configurable storage resources may include an instruction cache to store instructions for the applications, register banks to store data for the applications, context registers to store state information for threads of the applications, etc. The processing units may include an arithmetic logic unit (ALU) core, an elementary function core, a logic core, a texture sampler, a load control unit, a flow controller, etc. The multimedia processor allocates a configurable portion of the storage resources to each application and dynamically assigns the processing units to the applications as requested by these applications.