Quick Patent Search: fast search of global patents, free commercial-use patent database - IPRDB

21. Granted invention patent

US08345053B2 Graphics processors with parallel scheduling and execution of threads (Status: in force)
Publication No.: US08345053B2
Publication date: 2013-01-01
Application No.: US11533880
Filing date: 2006-09-21
Applicants: Guofang Jiao, Yun Du, Chun Yu
Inventors: Guofang Jiao, Yun Du, Chun Yu
IPC classification: G06F15/80, G06F15/00, G06T1/00
CPC classification: G06T15/005
Abstract: A graphics processor capable of parallel scheduling and execution of multiple threads, and techniques for achieving parallel scheduling and execution, are described. The graphics processor may include multiple hardware units and a scheduler. The hardware units are operable in parallel, with each hardware unit supporting a respective set of operations. The hardware units may include an ALU core, an elementary function core, a logic core, a texture sampler, a load control unit, some other hardware unit, or a combination thereof. The scheduler dispatches instructions for multiple threads to the hardware units concurrently. The graphics processor may further include an instruction cache to store instructions for threads and register banks to store data. The instruction cache and register banks may be shared by the hardware units.

22. Granted invention patent

US07685409B2 On-demand multi-thread multimedia processor (Status: in force)
Publication No.: US07685409B2
Publication date: 2010-03-23
Application No.: US11677362
Filing date: 2007-02-21
Applicants: Yun Du, Guofang Jiao, Chun Yu
Inventors: Yun Du, Guofang Jiao, Chun Yu
IPC classification: G06F9/00
CPC classification: G06F12/0842, G06F9/30145, G06F9/30167, G06F9/382, G06F9/383, G06F9/3851, G06F9/3885, G06F9/45558, G06F9/5016, G06F12/10, G06F2009/45579, G06F2009/45583, Y02D10/13, Y02D10/22
Abstract: A device includes a multimedia processor that can concurrently support multiple applications for various types of multimedia such as graphics, audio, video, camera, games, etc. The multimedia processor includes configurable storage resources to store instructions, data, and state information for the applications and assignable processing units to perform various types of processing for the applications. The configurable storage resources may include an instruction cache to store instructions for the applications, register banks to store data for the applications, context registers to store state information for threads of the applications, etc. The processing units may include an arithmetic logic unit (ALU) core, an elementary function core, a logic core, a texture sampler, a load control unit, a flow controller, etc. The multimedia processor allocates a configurable portion of the storage resources to each application and dynamically assigns the processing units to the applications as requested by these applications.

23. Invention patent application

US20090073168A1 FRAGMENT SHADER BYPASS IN A GRAPHICS PROCESSING UNIT, AND APPARATUS AND METHOD THEREOF (Status: in force)
Publication No.: US20090073168A1
Publication date: 2009-03-19
Application No.: US11855832
Filing date: 2007-09-14
Applicants: Guofang Jiao, Yun Du, Chun Yu
Inventors: Guofang Jiao, Yun Du, Chun Yu
IPC classification: G06T15/50
CPC classification: G06T15/005
Abstract: Configuration information is used to make a determination to bypass fragment shading by a shader unit of a graphics processing unit, the shader unit capable of performing both vertex shading and fragment shader. Based on the determination, the shader unit performs vertex shading and bypasses fragment shading. A processing element other than the shader unit, such as a pixel blender, can be used to perform some fragment shading. Power is managed to “turn off” power to unused components in a case that fragment shading is bypassed. For example, power can be turned off to a number of arithmetic logic units, the shader unit using the reduced number of arithmetic logic unit to perform vertex shading. At least one register bank of the shader unit can be used as a FIFO buffer storing pixel attribute data for use, with texture data, to fragment shading operations by another processing element.

24. Invention patent application

US20080059966A1 DEPENDENT INSTRUCTION THREAD SCHEDULING (Status: in force)
Publication No.: US20080059966A1
Publication date: 2008-03-06
Application No.: US11468221
Filing date: 2006-08-29
Applicants: Yun Du, Guofang Jiao, Chun Yu
Inventors: Yun Du, Guofang Jiao, Chun Yu
IPC classification: G06F9/46
CPC classification: G06F9/3851, G06F9/3824, G06F9/3838
Abstract: A thread scheduler includes context units for managing the execution of threads where each context unit includes a load reference counter for maintaining a counter value indicative of a difference between a number of data requests and a number of data returns associated with the particular context unit. A context controller of the thread context unit is configured to refrain from forwarding an instruction of a thread when the counter value is nonzero and the instruction includes a data dependency indicator indicating the instruction requires data returned by a previous instruction.
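The gating rule in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class and field names (`Instruction`, `ContextUnit`, `issues_load`, `depends_on_load`) are hypothetical and do not come from the patent, and the real design is hardware, not Python.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    """Hypothetical instruction record; field names are illustrative only."""
    name: str
    issues_load: bool = False      # the instruction sends a data request
    depends_on_load: bool = False  # stands in for the data dependency indicator


class ContextUnit:
    """One thread context with a load reference counter (assumed model)."""

    def __init__(self) -> None:
        # Counter value = number of data requests - number of data returns.
        self.load_ref_count = 0

    def can_forward(self, instr: Instruction) -> bool:
        # Hold the instruction only when it needs returned data AND
        # at least one earlier request is still outstanding.
        return not (instr.depends_on_load and self.load_ref_count != 0)

    def forward(self, instr: Instruction) -> bool:
        if not self.can_forward(instr):
            return False
        if instr.issues_load:
            self.load_ref_count += 1
        return True

    def data_returned(self) -> None:
        if self.load_ref_count == 0:
            raise RuntimeError("data return without an outstanding request")
        self.load_ref_count -= 1


ctx = ContextUnit()
load = Instruction("load r0", issues_load=True)
use = Instruction("add r1, r0, r0", depends_on_load=True)
independent = Instruction("mul r2, r3, r3")

results = [
    ctx.forward(load),         # forwarded; counter becomes 1
    ctx.forward(use),          # held: counter nonzero and dependency flagged
    ctx.forward(independent),  # forwarded: no dependency flag
]
ctx.data_returned()            # counter returns to 0
results.append(ctx.forward(use))  # now forwarded
```

In this model an independent instruction is never blocked by an outstanding load, which is the point of pairing the counter with a per-instruction dependency flag.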

25. Invention patent application

US20080059756A1 RELATIVE ADDRESS GENERATION (Status: in force)
Publication No.: US20080059756A1
Publication date: 2008-03-06
Application No.: US11469347
Filing date: 2006-08-31
Applicants: Yun Du, Chun Yu, Guofang Jiao
Inventors: Yun Du, Chun Yu, Guofang Jiao
IPC classification: G06F12/10
CPC classification: G06F12/06, G06F9/345, G06F9/355, G06F9/3802, G06F9/3875
Abstract: Techniques to efficiently handle relative addressing are described. In one design, a processor includes an address generator and a storage unit. The address generator receives a relative address comprised of a base address and an offset, obtains a base value for the base address, sums the base value with the offset, and provides an absolute address corresponding to the relative address. The storage unit receives the base address and provides the base value to the address generator. The storage unit also receives the absolute address and provides data at this address. The address generator may derive the absolute address in a first clock cycle of a memory access. The storage unit may provide the data in a second clock cycle of the memory access. The storage unit may have multiple (e.g., two) read ports to support concurrent address generation and data retrieval.
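The two-step access described in this abstract (base lookup plus offset, then the data read) can be sketched as follows. The names `StorageUnit`, `AddressGenerator`, and `relative_load` are assumptions made for illustration, and the "cycles" are modeled as two sequential reads, not real clocked hardware.

```python
class StorageUnit:
    """Hypothetical storage holding both base values and data."""

    def __init__(self, cells: dict) -> None:
        self.cells = cells

    def read(self, address: int) -> int:
        return self.cells[address]


class AddressGenerator:
    """Turns a relative address (base address, offset) into an absolute one."""

    def __init__(self, storage: StorageUnit) -> None:
        self.storage = storage

    def absolute(self, base_address: int, offset: int) -> int:
        base_value = self.storage.read(base_address)  # read of the base value
        return base_value + offset


def relative_load(storage: StorageUnit, base_address: int, offset: int):
    # First step ("first clock cycle" in the abstract): derive the absolute address.
    absolute = AddressGenerator(storage).absolute(base_address, offset)
    # Second step ("second clock cycle"): fetch the data at that address.
    return absolute, storage.read(absolute)


# Cell 0 holds the base value 16; cell 19 holds the data word 42.
storage = StorageUnit({0: 16, 19: 42})
absolute, data = relative_load(storage, base_address=0, offset=3)
```

Here `absolute` is 19 (16 + 3) and `data` is 42. The abstract's two read ports would let the base read for one access overlap with the data read for another; this sequential sketch does not model that overlap.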

26. Invention patent application

US20080030513A1 Graphics processing unit with extended vertex cache (Status: in force)
Publication No.: US20080030513A1
Publication date: 2008-02-07
Application No.: US11499187
Filing date: 2006-08-03
Applicants: Guofang Jiao, Brian Evan Ruttenberg, Chun Yu, Yun Du
Inventors: Guofang Jiao, Brian Evan Ruttenberg, Chun Yu, Yun Du
IPC classification: G06T1/60
CPC classification: G06T15/005
Abstract: Techniques are described for processing computerized images with a graphics processing unit (GPU) using an extended vertex cache. The techniques include creating an extended vertex cache coupled to a GPU pipeline to reduce an amount of data passing through the GPU pipeline. The GPU pipeline receives an image geometry for an image, and stores attributes for vertices within the image geometry in the extended vertex cache. The GPU pipeline only passes vertex coordinates that identify the vertices and vertex cache index values that indicate storage locations of the attributes for each of the vertices in the extended vertex cache to other processing stages along the GPU pipeline. The techniques described herein defer the setup of attribute gradients to just before attribute interpolation in the GPU pipeline. The vertex attributes may be retrieved from the extended vertex cache for attribute gradient setup just before attribute interpolation in the GPU pipeline.

27. Invention patent application

US20080028152A1 Tiled cache for multiple software programs (Status: in force)
Publication No.: US20080028152A1
Publication date: 2008-01-31
Application No.: US11493444
Filing date: 2006-07-25
Applicants: Yun Du, Guofang Jiao, Chun Yu, De Dzwo Hsu
Inventors: Yun Du, Guofang Jiao, Chun Yu, De Dzwo Hsu
IPC classification: G06F12/00, G06F12/08
CPC classification: G06F12/0864, G06F9/3802, G06F9/3851, G06F12/0842
Abstract: Caching techniques for storing instructions, constant values, and other types of data for multiple software programs are described. A cache provides storage for multiple programs and is partitioned into multiple tiles. Each tile is assignable to one program. Each program may be assigned any number of tiles based on the program's cache usage, the available tiles, and/or other factors. A cache controller identifies the tiles assigned to the programs and generates cache addresses for accessing the cache. The cache may be partitioned into physical tiles. The cache controller may assign logical tiles to the programs and may map the logical tiles to the physical tiles within the cache. The use of logical and physical tiles may simplify assignment and management of the tiles.
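The logical-to-physical tile mapping described in this abstract can be sketched as a lookup table per program. Everything below is an assumed illustration: the class name `TiledCacheController`, the first-free assignment policy, and the address formula are not taken from the patent.

```python
class TiledCacheController:
    """Toy cache controller that maps per-program logical tiles to physical tiles."""

    def __init__(self, num_tiles: int, tile_size: int) -> None:
        self.tile_size = tile_size
        self.free_tiles = list(range(num_tiles))  # physical tiles not yet assigned
        # program name -> physical tile numbers, indexed by logical tile number
        self.tile_map = {}

    def assign(self, program: str, count: int) -> None:
        """Give a program `count` more tiles (assumed policy: lowest free tile first)."""
        if count > len(self.free_tiles):
            raise ValueError("not enough free tiles")
        tiles = [self.free_tiles.pop(0) for _ in range(count)]
        self.tile_map.setdefault(program, []).extend(tiles)

    def cache_address(self, program: str, program_offset: int) -> int:
        """Translate an offset in the program's logical tile space to a cache address."""
        logical_tile, within_tile = divmod(program_offset, self.tile_size)
        physical_tile = self.tile_map[program][logical_tile]
        return physical_tile * self.tile_size + within_tile


controller = TiledCacheController(num_tiles=4, tile_size=8)
controller.assign("A", 1)  # A: logical tile 0 -> physical tile 0
controller.assign("B", 2)  # B: logical tiles 0, 1 -> physical tiles 1, 2
controller.assign("A", 1)  # A: logical tile 1 -> physical tile 3

addr_a = controller.cache_address("A", 9)  # logical tile 1, offset 1
addr_b = controller.cache_address("B", 3)  # logical tile 0, offset 3
```

Program A sees a contiguous logical space even though its physical tiles (0 and 3) are not adjacent, which is one way the logical/physical split could simplify tile management.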

28. Granted invention patent

US08884972B2 Graphics processor with arithmetic and elementary function units (Status: in force)
Publication No.: US08884972B2
Publication date: 2014-11-11
Application No.: US11441696
Filing date: 2006-05-25
Applicants: Yun Du, Guofang Jiao, Chun Yu, Alexei V. Bourd
Inventors: Yun Du, Guofang Jiao, Chun Yu, Alexei V. Bourd
IPC classification: G06F15/16, G06F15/00, G06T1/00, G06F9/38, G06F9/30
CPC classification: G06T1/20, G06F9/30167, G06F9/383, G06F9/3851, G06F9/3885
Abstract: A graphics processor capable of efficiently performing arithmetic operations and computing elementary functions is described. The graphics processor has at least one arithmetic logic unit (ALU) that can perform arithmetic operations and at least one elementary function unit that can compute elementary functions. The ALU(s) and elementary function unit(s) may be arranged such that they can operate in parallel to improve throughput. The graphics processor may also include fewer elementary function units than ALUs, e.g., four ALUs and a single elementary function unit. The four ALUs may perform an arithmetic operation on (1) four components of an attribute for one pixel or (2) one component of an attribute for four pixels. The single elementary function unit may operate on one component of one pixel at a time. The use of a single elementary function unit may reduce cost while still providing good performance.
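The two ALU usage modes and the single elementary function unit in this abstract can be mimicked in a few lines. This is a rough sketch: `alu_quad` and `elementary_unit` are hypothetical helper names, the lanes run sequentially in Python, and the step count is only a stand-in for hardware cycles.

```python
import math
from operator import add


def alu_quad(op, a, b):
    """Apply one arithmetic operation across four lanes, one lane per ALU."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("expected exactly four operands per side")
    return [op(x, y) for x, y in zip(a, b)]


def elementary_unit(fn, values):
    """Single elementary function unit: handles one component per step."""
    results = []
    steps = 0
    for value in values:
        results.append(fn(value))
        steps += 1
    return results, steps


# Mode 1: four components (e.g. R, G, B, A) of an attribute for one pixel.
one_pixel = alu_quad(add, [1, 2, 3, 4], [10, 20, 30, 40])

# Mode 2: one component (e.g. R) of an attribute for four pixels.
red_of_four_pixels = alu_quad(add, [5, 6, 7, 8], [1, 1, 1, 1])

# The single elementary function unit needs four steps for four components.
roots, steps = elementary_unit(math.sqrt, [1.0, 4.0, 9.0, 16.0])
```

The same four-lane helper serves both modes; only the meaning of the operands changes. The elementary unit takes one step per component, which reflects the cost versus throughput trade-off the abstract mentions.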

29. Granted invention patent

US08203564B2 Efficient 2-D and 3-D graphics processing (Status: in force)
Publication No.: US08203564B2
Publication date: 2012-06-19
Application No.: US11675662
Filing date: 2007-02-16
Applicants: Guofang Jiao, Angus M. Dorbie, Yun Du, Chun Yu, Jay C. Yun
Inventors: Guofang Jiao, Angus M. Dorbie, Yun Du, Chun Yu, Jay C. Yun
IPC classification: G06T1/20, G06T1/00, G06T15/40
CPC classification: G06T15/005, G06T11/40, G09G5/363
Abstract: Techniques for supporting both 2-D and 3-D graphics are described. A graphics processing unit (GPU) may perform 3-D graphics processing in accordance with a 3-D graphics pipeline to render 3-D images and may also perform 2-D graphics processing in accordance with a 2-D graphics pipeline to render 2-D images. Each stage of the 2-D graphics pipeline may be mapped to at least one stage of the 3-D graphics pipeline. For example, a clipping, masking and scissoring stage in 2-D graphics may be mapped to a depth test stage in 3-D graphics. Coverage values for pixels within paths in 2-D graphics may be determined using rasterization and depth test stages in 3-D graphics. A paint generation stage and an image interpolation stage in 2-D graphics may be mapped to a fragment shader stage in 3-D graphics. A blending stage in 2-D graphics may be mapped to a blending stage in 3-D graphics.
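The stage mapping listed in this abstract can be written down as a lookup table. The table below restates only the examples the abstract gives; the dictionary name, the key strings, and the helper `stages_3d_used` are illustrative assumptions, not an API of any GPU driver.

```python
# 2-D stage -> 3-D stage(s), restating the examples in the abstract.
STAGE_MAP_2D_TO_3D = {
    "clipping, masking and scissoring": ["depth test"],
    "coverage for pixels within paths": ["rasterization", "depth test"],
    "paint generation": ["fragment shader"],
    "image interpolation": ["fragment shader"],
    "blending": ["blending"],
}


def stages_3d_used(stages_2d):
    """Return the distinct 3-D stages needed for the given 2-D stages, in first-use order."""
    used = []
    for stage in stages_2d:
        for target in STAGE_MAP_2D_TO_3D[stage]:
            if target not in used:
                used.append(target)
    return used


all_3d_stages = stages_3d_used(list(STAGE_MAP_2D_TO_3D))
```

Five 2-D stages collapse onto four 3-D stages in this table, which illustrates how one set of pipeline hardware could serve both kinds of rendering.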

30. Granted invention patent

US08009172B2 Graphics processing unit with shared arithmetic logic unit (Status: in force)
Publication No.: US08009172B2
Publication date: 2011-08-30
Application No.: US11550344
Filing date: 2006-10-17
Applicants: Guofang Jiao, Brian Ruttenberg, Chun Yu, Yun Du
Inventors: Guofang Jiao, Brian Ruttenberg, Chun Yu, Yun Du
IPC classification: G06T1/20
CPC classification: G06T15/005
Abstract: This disclosure describes a graphics processing unit (GPU) pipeline that uses one or more shared arithmetic logic units (ALUs). In order to facilitate such sharing of ALUs, the stages of the disclosed GPU pipeline may be rearranged relative to conventional GPU pipelines. In addition, by rearranging the stages of the GPU pipeline, efficiencies may be achieved in the image processing. Unlike conventional GPU pipelines, for example, an attribute gradient setup stage can be located much later in the pipeline, and the attribute interpolator stage may immediately follow the attribute gradient setup stage. This allows sharing of an ALU by the attribute gradient setup and attribute interpolator stages. Several other techniques and features for the GPU pipeline are also described, which may improve performance and possibly achieve additional processing efficiencies.
