会员体验
专利管家(专利管理)
工作空间(专利管理)
风险监控(情报监控)
数据分析(专利分析)
侵权分析(诉讼无效)
联系我们
交流群
官方交流:
QQ群: 891211   
微信请扫码    >>>
现在联系顾问~
热词
    • 11. 发明申请
    • Graphics processing unit with extended vertex cache
    • 具有扩展顶点缓存的图形处理单元
    • US20080030513A1
    • 2008-02-07
    • US11499187
    • 2006-08-03
    • Guofang JiaoBrian Evan RuttenbergChun YuYun Du
    • Guofang JiaoBrian Evan RuttenbergChun YuYun Du
    • G06T1/60
    • G06T15/005
    • Techniques are described for processing computerized images with a graphics processing unit (GPU) using an extended vertex cache. The techniques include creating an extended vertex cache coupled to a GPU pipeline to reduce an amount of data passing through the GPU pipeline. The GPU pipeline receives an image geometry for an image, and stores attributes for vertices within the image geometry in the extended vertex cache. The GPU pipeline only passes vertex coordinates that identify the vertices and vertex cache index values that indicate storage locations of the attributes for each of the vertices in the extended vertex cache to other processing stages along the GPU pipeline. The techniques described herein defer the setup of attribute gradients to just before attribute interpolation in the GPU pipeline. The vertex attributes may be retrieved from the extended vertex cache for attribute gradient setup just before attribute interpolation in the GPU pipeline.
    • 描述了使用扩展顶点高速缓存处理具有图形处理单元(GPU)的计算机化图像的技术。 这些技术包括创建一个连接到GPU流水线的扩展顶点缓存,以减少通过GPU流水线的数据量。 GPU流水线接收图像的图像几何,并在扩展顶点高速缓存中存储图像几何中的顶点的属性。 GPU流水线仅通过顶点坐标,其顶点和顶点高速缓存索引值指示扩展顶点高速缓存中每个顶点的属性的存储位置,沿着GPU流水线到其他处理阶段。 本文描述的技术将属性梯度的设置延迟到GPU管线中的属性插值之前。 可以从扩展顶点高速缓存中检索顶点属性,以便在GPU管线中的属性插值之前进行属性梯度设置。
    • 12. 发明申请
    • Tiled cache for multiple software programs
    • 多个软件程序的平铺缓存
    • US20080028152A1
    • 2008-01-31
    • US11493444
    • 2006-07-25
    • Yun DuGuofang JiaoChun YuDe Dzwo Hsu
    • Yun DuGuofang JiaoChun YuDe Dzwo Hsu
    • G06F12/00G06F12/08
    • G06F12/0864G06F9/3802G06F9/3851G06F12/0842
    • Caching techniques for storing instructions, constant values, and other types of data for multiple software programs are described. A cache provides storage for multiple programs and is partitioned into multiple tiles. Each tile is assignable to one program. Each program may be assigned any number of tiles based on the program's cache usage, the available tiles, and/or other factors. A cache controller identifies the tiles assigned to the programs and generates cache addresses for accessing the cache. The cache may be partitioned into physical tiles. The cache controller may assign logical tiles to the programs and may map the logical tiles to the physical tiles within the cache. The use of logical and physical tiles may simplify assignment and management of the tiles.
    • 描述用于存储用于多个软件程序的指令,常数值和其他类型的数据的缓存技术。 高速缓存为多个程序提供存储,并分区成多个瓦片。 每个瓦片可分配给一个程序。 可以基于程序的高速缓存使用,可用的瓦片和/或其它因素来为每个程序分配任意数量的瓦片。 缓存控制器识别分配给程序的块,并生成用于访问高速缓存的高速缓存地址。 缓存可以被划分成物理块。 高速缓存控制器可以向程序分配逻辑块,并且可以将逻辑块映射到高速缓存内的物理块。 逻辑和物理瓦片的使用可以简化瓦片的分配和管理。
    • 13. 发明授权
    • Graphics processor with arithmetic and elementary function units
    • 具有算术和基本功能单元的图形处理器
    • US08884972B2
    • 2014-11-11
    • US11441696
    • 2006-05-25
    • Yun DuGuofang JiaoChun YuAlexei V. Bourd
    • Yun DuGuofang JiaoChun YuAlexei V. Bourd
    • G06F15/16G06F15/00G06T1/00G06F9/38G06F9/30
    • G06T1/20G06F9/30167G06F9/383G06F9/3851G06F9/3885
    • A graphics processor capable of efficiently performing arithmetic operations and computing elementary functions is described. The graphics processor has at least one arithmetic logic unit (ALU) that can perform arithmetic operations and at least one elementary function unit that can compute elementary functions. The ALU(s) and elementary function unit(s) may be arranged such that they can operate in parallel to improve throughput. The graphics processor may also include fewer elementary function units than ALUs, e.g., four ALUs and a single elementary function unit. The four ALUs may perform an arithmetic operation on (1) four components of an attribute for one pixel or (2) one component of an attribute for four pixels. The single elementary function unit may operate on one component of one pixel at a time. The use of a single elementary function unit may reduce cost while still providing good performance.
    • 描述能够有效执行算术运算和计算基本功能的图形处理器。 图形处理器具有至少一个可执行算术运算的算术逻辑单元(ALU)和至少一个可以计算基本功能的基本功能单元。 ALU和基本功能单元可以被布置成使得它们可以并行操作以提高吞吐量。 图形处理器还可以包括比ALU更少的基本功能单元,例如四个ALU和单个基本功能单元。 四个ALU可以对(1)四个像素的属性的四个分量或(2)四个像素的属性的一个分量执行算术运算。 单个基本功能单元可以一次操作一个像素的一个分量。 使用单个基本功能单元可以降低成本,同时仍然提供良好的性能。
    • 14. 发明授权
    • Graphics processing unit with deferred vertex shading
    • 具有延迟顶点着色的图形处理单元
    • US08436854B2
    • 2013-05-07
    • US12557427
    • 2009-09-10
    • Guofang JiaoYun DuLingjun ChenChun Yu
    • Guofang JiaoYun DuLingjun ChenChun Yu
    • G06T15/40
    • G06T15/40G06T1/20G06T15/005
    • Techniques are described for processing graphics images with a graphics processing unit (GPU) using deferred vertex shading. An example method includes the following: generating, within a processing pipeline of a graphics processing unit (GPU), vertex coordinates for vertices of each primitive within an image geometry, wherein the vertex coordinates comprise a location and a perspective parameter for each one of the vertices, and wherein the image geometry represents a graphics image; identifying, within the processing pipeline of the GPU, visible primitives within the image geometry based upon the vertex coordinates; and, responsive to identifying the visible primitives, generating, within the processing pipeline of the GPU, vertex attributes only for the vertices of the visible primitives in order to determine surface properties of the graphics image.
    • 描述了使用延迟顶点着色处理具有图形处理单元(GPU)的图形图像的技术。 示例性方法包括以下:在图形处理单元(GPU)的处理流水线内生成图像几何中每个图元的顶点的顶点坐标,其中顶点坐标包括位置和透视参数 顶点,并且其中图像几何表示图形图像; 在GPU的处理流水线内识别基于顶点坐标的图像几何图形内的可见原始图形; 并且响应于识别可见原语,在GPU的处理流水线内生成仅针对可见图元的顶点的顶点属性,以便确定图形图像的表面特性。
    • 15. 发明授权
    • 3-D clipping in a graphics processing unit
    • 图形处理单元中的3-D剪辑
    • US08212840B2
    • 2012-07-03
    • US11551900
    • 2006-10-23
    • Guofang JiaoChun YuLingjun ChenYun Du
    • Guofang JiaoChun YuLingjun ChenYun Du
    • G09G5/00
    • G06T1/20G06T11/40G06T11/60G06T15/005G06T15/30G06T19/00G09G5/393
    • A graphics processing unit (GPU) efficiently performs 3-dimensional (3-D) clipping using processing units used for other graphics functions. The GPU includes first and second hardware units and at least one buffer. The first hardware unit performs 3-D clipping of primitives using a first processing unit used for a first graphics function, e.g., an ALU used for triangle setup, depth gradient setup, etc. The first hardware unit may perform 3-D clipping by (a) computing clip codes for each vertex of each primitive, (b) determining whether to pass, discard or clip each primitive based on the clip codes for all vertices of the primitive, and (c) clipping each primitive to be clipped against clipping planes. The second hardware unit computes attribute component values for new vertices resulting from the 3-D clipping, e.g., using an ALU used for attribute gradient setup, attribute interpolation, etc. The buffer(s) store intermediate results of the 3-D clipping.
    • 图形处理单元(GPU)使用用于其他图形功能的处理单元有效地执行三维(3-D)剪辑。 GPU包括第一和第二硬件单元和至少一个缓冲器。 第一硬件单元使用用于第一图形功能的第一处理单元(例如用于三角形设置的ALU,深度梯度设置等)来对原语执行3-D限幅。第一硬件单元可以通过( a)计算每个图元的每个顶点的剪辑代码,(b)基于所述基元的所有顶点的剪辑代码来确定是否传递,丢弃或剪切每个图元,以及(c)剪切要针对剪切平面剪切的每个图元 。 第二硬件单元计算由3-D限幅产生的新顶点的属性分量值,例如使用用于属性梯度设置,属性插值等的ALU。该缓冲器存储3-D限幅的中间结果。
    • 16. 发明授权
    • Graphics processing unit with shared arithmetic logic unit
    • 具有共享算术逻辑单元的图形处理单元
    • US08009172B2
    • 2011-08-30
    • US11550344
    • 2006-10-17
    • Guofang JiaoBrian RuttenbergChun YuYun Du
    • Guofang JiaoBrian RuttenbergChun YuYun Du
    • G06T1/20
    • G06T15/005
    • This disclosure describes a graphics processing unit (GPU) pipeline that uses one or more shared arithmetic logic units (ALUs). In order to facilitate such sharing of ALUs, the stages of the disclosed GPU pipeline may be rearranged relative to conventional GPU pipelines. In addition, by rearranging the stages of the GPU pipeline, efficiencies may be achieved in the image processing. Unlike conventional GPU pipelines, for example, an attribute gradient setup stage can be located much later in the pipeline, and the attribute interpolator stage may immediately follow the attribute gradient setup stage. This allows sharing of an ALU by the attribute gradient setup and attribute interpolator stages. Several other techniques and features for the GPU pipeline are also described, which may improve performance and possibly achieve additional processing efficiencies.
    • 本公开描述了使用一个或多个共享算术逻辑单元(ALU)的图形处理单元(GPU)流水线。 为了促进ALU的这种共享,所公开的GPU流水线的阶段可以相对于传统的GPU管线重新排列。 此外,通过重新排列GPU流水线的各个阶段,可以在图像处理中实现效率。 与传统GPU流水线不同,例如,属性梯度建立阶段可以在流水线后面定位,属性内插器阶段可以立即跟随属性梯度建立阶段。 这允许通过属性渐变设置和属性内插器阶段共享ALU。 还描述了用于GPU流水线的若干其它技术和特征,这可以提高性能并可能实现额外的处理效率。
    • 17. 发明申请
    • PROGRAMMABLE BLENDING IN A GRAPHICS PROCESSING UNIT
    • 图形处理单元中的可编程混合
    • US20080094410A1
    • 2008-04-24
    • US11550958
    • 2006-10-19
    • Guofang JiaoChun YuLingjun ChenYun Du
    • Guofang JiaoChun YuLingjun ChenYun Du
    • G09G5/02
    • G06T15/503G06T2210/32
    • Techniques for implementing blending equations for various blending modes with a base set of operations are described. Each blending equation may be decomposed into a sequence of operations. In one design, a device includes a processing unit that implements a set of operations for multiple blending modes and a storage unit that stores operands and results. The processing unit receives a sequence of instructions for a sequence of operations for a blending mode selected from the plurality of blending modes and executes each instruction in the sequence to perform blending in accordance with the selected blending mode. The processing unit may include (a) an ALU that performs at least one operation in the base set, e.g., a dot product, (b) a pre-formatting unit that performs gamma correction and alpha scaling of inbound color values, and (c) a post-formatting unit that performs gamma compression and alpha scaling of outbound color values.
    • 描述了用于具有基本操作集合的用于各种混合模式的混合方程的技术。 每个混合方程可以分解为一系列操作。 在一种设计中,设备包括一个处理单元,该处理单元实现多种混合模式的一组操作,以及存储操作数和结果的存储单元。 处理单元接收用于从多个混合模式中选择的混合模式的操作序列的指令序列,并且执行该顺序中的每个指令以根据所选择的混合模式执行混合。 处理单元可以包括(a)执行基本集合中的至少一个操作的ALU,例如点积,(b)执行伽马校正和入站颜色值的α缩放的预格式化单元,以及(c )一个后格式化单元,用于执行出色色彩值的伽玛压缩和alpha缩放。
    • 18. 发明申请
    • Graphics Processors With Parallel Scheduling and Execution of Threads
    • 具有并行调度和执行线程的图形处理器
    • US20080074433A1
    • 2008-03-27
    • US11533880
    • 2006-09-21
    • Guofang JiaoYun DuChun Yu
    • Guofang JiaoYun DuChun Yu
    • G06T1/00
    • G06T15/005
    • A graphics processor capable of parallel scheduling and execution of multiple threads, and techniques for achieving parallel scheduling and execution, are described. The graphics processor may include multiple hardware units and a scheduler. The hardware units are operable in parallel, with each hardware unit supporting a respective set of operations. The hardware units may include an ALU core, an elementary function core, a logic core, a texture sampler, a load control unit, some other hardware unit, or a combination thereof. The scheduler dispatches instructions for multiple threads to the hardware units concurrently. The graphics processor may further include an instruction cache to store instructions for threads and register banks to store data. The instruction cache and register banks may be shared by the hardware units.
    • 描述了能够并行调度和执行多个线程的图形处理器以及用于实现并行调度和执行的技术。 图形处理器可以包括多个硬件单元和调度器。 硬件单元可并行操作,每个硬件单元支持相应的一组操作。 硬件单元可以包括ALU核,基本功能核心,逻辑核心,纹理采样器,负载控制单元,一些其他硬件单元或其组合。 调度器将多个线程的指令同时分配到硬件单元。 图形处理器还可以包括指令高速缓存以存储线程和寄存器组以存储数据的指令。 指令高速缓存和寄存器组可以由硬件单元共享。
    • 19. 发明申请
    • Multi-stage floating-point accumulator
    • 多级浮点累加器
    • US20080046495A1
    • 2008-02-21
    • US11506349
    • 2006-08-18
    • Yun DuChun YuGuofang Jiao
    • Yun DuChun YuGuofang Jiao
    • G06F7/38
    • G06F7/5095G06F5/012G06F7/485G06F7/49936G06F2207/3884
    • A multi-stage floating-point accumulator includes at least two stages and is capable of operating at higher speed. In one design, the floating-point accumulator includes first and second stages. The first stage includes three operand alignment units, two multiplexers, and three latches. The three operand alignment units operate on a current floating-point value, a prior floating-point value, and a prior accumulated value. A first multiplexer provides zero or the prior floating-point value to the second operand alignment unit. A second multiplexer provides zero or the prior accumulated value to the third operand alignment unit. The three latches couple to the three operand alignment units. The second stage includes a 3-operand adder to sum the operands generated by the three operand alignment units, a latch, and a post alignment unit.
    • 多级浮点累加器包括至少两级,并且能够以更高的速度运行。 在一种设计中,浮点累加器包括第一级和第二级。 第一级包括三个操作对准单元,两个多路复用器和三个锁存器。 三个操作数对齐单元以当前浮点值,先前浮点值和先前累加值操作。 第一多路复用器为第二操作数对准单元提供零或先前的浮点值。 第二多路复用器为第三操作数对准单元提供零或先前的累加值。 三个锁存器耦合到三个操作数对齐单元。 第二级包括一个3操作数加法器,用于对由三个操作数对齐单元产生的操作数,一个锁存器和一个后置对准单元求和。
    • 20. 发明申请
    • GRAPHICS PROCESSING UNIT WITH SHARED ARITHMETIC LOGIC UNIT
    • 具有共享算术逻辑单元的图形处理单元
    • US20080030512A1
    • 2008-02-07
    • US11550344
    • 2006-10-17
    • Guofang JiaoBrian RuttenbergChun YuYun Du
    • Guofang JiaoBrian RuttenbergChun YuYun Du
    • G06T1/20
    • G06T15/005
    • This disclosure describes a graphics processing unit (GPU) pipeline that uses one or more shared arithmetic logic units (ALUs). In order to facilitate such sharing of ALUs, the stages of the disclosed GPU pipeline may be rearranged relative to conventional GPU pipelines. In addition, by rearranging the stages of the GPU pipeline, efficiencies may be achieved in the image processing. Unlike conventional GPU pipelines, for example, an attribute gradient setup stage can be located much later in the pipeline, and the attribute interpolator stage may immediately follow the attribute gradient setup stage. This allows sharing of an ALU by the attribute gradient setup and attribute interpolator stages. Several other techniques and features for the GPU pipeline are also described, which may improve performance and possibly achieve additional processing efficiencies.
    • 本公开描述了使用一个或多个共享算术逻辑单元(ALU)的图形处理单元(GPU)流水线。 为了促进ALU的这种共享,所公开的GPU流水线的阶段可以相对于常规GPU流水线重新排列。 此外,通过重新排列GPU流水线的各个阶段,可以在图像处理中实现效率。 与传统GPU流水线不同,例如,属性梯度建立阶段可以在流水线后面定位,并且属性内插器阶段可以立即跟随属性梯度建立阶段。 这允许通过属性渐变设置和属性内插器阶段共享ALU。 还描述了用于GPU流水线的若干其它技术和特征,这可以提高性能并可能实现额外的处理效率。