    • 51. Invention patent application
    • Support for Non-Local Returns in Parallel Thread SIMD Engine
    • US20110078418A1
    • 2011-03-31
    • US12881065
    • 2010-09-13
    • Guillermo Juan Rozas; Brett W. Coon
    • G06F9/38
    • G06F9/30058; G06F9/3851
    • One embodiment of the present invention sets forth a method for executing a non-local return instruction in a parallel thread processor. The method comprises the steps of receiving, within the thread group, a first long jump instruction and, in response, popping a first token from the execution stack. The method also comprises determining whether the first token is a first long jump token that was pushed onto the execution stack when a first push instruction associated with the first long jump instruction was executed, and when the first token is the first long jump token, jumping to the second instruction based on the address specified by the first long jump token, or, when the first token is not the first long jump token, disabling the active thread until the first long jump token is popped from the execution stack.
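The long-jump token mechanism in this abstract can be sketched in Python. The `Token` and `ExecStack` names, the token-kind strings, and the two-step retry are illustrative assumptions, not terms from the patent claims; the sketch only models one thread popping the execution stack.

```python
# Minimal sketch of the non-local return (long jump) mechanism: a push
# instruction leaves a long-jump token on the execution stack; when the
# long jump executes, a matching token redirects the program counter,
# while any other token disables the thread until the long-jump token
# eventually surfaces. Names here are hypothetical.
from dataclasses import dataclass

@dataclass
class Token:
    kind: str       # e.g. "LONGJMP_TOKEN" or "DIVERGE_TOKEN" (illustrative)
    address: int    # instruction address the token refers to

class ExecStack:
    def __init__(self):
        self._stack = []

    def push(self, token: Token) -> None:
        self._stack.append(token)

    def pop(self) -> Token:
        return self._stack.pop()

def execute_long_jump(stack: ExecStack, pc: int) -> tuple[int, bool]:
    """Pop one token. If it is the long-jump token, jump to its address
    and stay active; otherwise keep the old pc and disable the thread."""
    token = stack.pop()
    if token.kind == "LONGJMP_TOKEN":
        return token.address, True   # jump to the saved target address
    return pc, False                 # disabled until the token is popped
```

A disabled thread simply retries once intervening tokens (pushed by later control-flow constructs) have been popped off the stack.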
    • 54. Invention patent grant
    • Register based queuing for texture requests
    • US07864185B1
    • 2011-01-04
    • US12256848
    • 2008-10-23
    • John Erik Lindholm; John R. Nickolls; Simon S. Moy; Brett W. Coon
    • G06T11/40; G06T15/00; G06T15/20; G06T1/00
    • G06T11/60; G09G5/363
    • A graphics processing unit can queue a large number of texture requests to balance out the variability of texture requests without the need for a large texture request buffer. A dedicated texture request buffer queues the relatively small texture commands and parameters. Additionally, for each queued texture command, an associated set of texture arguments, which are typically much larger than the texture command, are stored in a general purpose register. The texture unit retrieves texture commands from the texture request buffer and then fetches the associated texture arguments from the appropriate general purpose register. The texture arguments may be stored in the general purpose register designated as the destination of the final texture value computed by the texture unit. Because the destination register must be allocated for the final texture value as texture commands are queued, storing the texture arguments in this register does not consume any additional registers.
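The register-based queuing scheme described here can be sketched as follows. The function names, the register-file dictionary, and the `sample` callback are hypothetical stand-ins; the point illustrated is that the bulky texture arguments live in the destination register that later receives the final texture value, so only the small command occupies the dedicated queue.

```python
# Sketch of register-based queuing for texture requests: a small
# dedicated buffer holds only (command, destination register) pairs,
# while the larger texture arguments are parked in the destination GPR
# that the final texture value will overwrite. Names are illustrative.
from collections import deque

registers = {}           # general-purpose register file: index -> contents
texture_queue = deque()  # small dedicated buffer: (command, dest_reg) only

def issue_texture_request(command: str, dest_reg: int, args: tuple) -> None:
    # Store the arguments in the register already allocated for the
    # result, so queuing the request consumes no additional registers.
    registers[dest_reg] = args
    texture_queue.append((command, dest_reg))

def texture_unit_step(sample) -> int:
    # Dequeue the small command, then fetch its arguments from the GPR.
    command, dest_reg = texture_queue.popleft()
    args = registers[dest_reg]
    registers[dest_reg] = sample(command, args)  # result overwrites args
    return dest_reg
```

Because the destination register is allocated at queue time anyway, reusing it for the in-flight arguments is free, which is the register-saving observation the abstract makes.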
    • 55. Invention patent grant
    • Multi-threaded stack cache
    • US07805573B1
    • 2010-09-28
    • US11313448
    • 2005-12-20
    • Brett W. Coon
    • G06F12/00; G06F13/00; G06F13/28
    • G06F12/0875; G06F12/0842
    • Systems and methods for storing stack data for multi-threaded processing in a specialized cache reduce on-chip memory requirements while maintaining low access latency. An on-chip stack cache is used to store a predetermined number of stack entries for a thread. When additional entries are needed for the thread, entries stored in the stack cache are spilled, i.e., moved, to remote memory. As entries are popped off the on-chip stack cache, spilled entries are restored from the remote memory. The spilling and restoring processes may be performed while the on-chip stack cache is accessed. Therefore, a large stack size is supported using a smaller amount of die area than that needed to store the entire large stack on-chip. The large stack may be accessed without incurring the latency of reading and writing to remote memory since the stack cache is preemptively spilled and restored.
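The spill/restore behavior of this stack cache can be modeled in a few lines. The class name, the list-based "remote memory", and the fixed capacity are illustrative assumptions; a real implementation would spill and restore asynchronously, overlapped with stack accesses, as the abstract notes.

```python
# Sketch of a multi-threaded stack cache: a small fixed-capacity on-chip
# structure holds the top of the stack; older entries spill to remote
# memory on overflow and are restored as the stack shrinks, so the full
# stack behaves as if it were entirely on-chip. Names are illustrative.
class SpillingStack:
    def __init__(self, on_chip_capacity: int):
        self.capacity = on_chip_capacity
        self.on_chip = []   # fast on-chip entries, top of stack at the end
        self.remote = []    # spilled entries in remote memory, oldest first

    def push(self, value) -> None:
        if len(self.on_chip) == self.capacity:
            # Spill the oldest on-chip entry to remote memory.
            self.remote.append(self.on_chip.pop(0))
        self.on_chip.append(value)

    def pop(self):
        value = self.on_chip.pop()
        if self.remote and len(self.on_chip) < self.capacity:
            # Restore a spilled entry so the cache stays warm for later pops.
            self.on_chip.insert(0, self.remote.pop())
        return value
```

In hardware the restore would be issued ahead of demand (preemptively), hiding remote-memory latency; the sketch restores eagerly on each pop to keep the model small.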
    • 57. Invention patent grant
    • Register file allocation
    • US07634621B1
    • 2009-12-15
    • US11556677
    • 2006-11-03
    • Brett W. Coon; John Erik Lindholm; Gary Tarolli; Svetoslav D. Tzvetkov; John R. Nickolls; Ming Y. Siu
    • G06F12/00
    • G06F9/3012; G06F9/30123; G06F9/3824; G06F9/3851; G06F9/3885; G06F12/0223; Y02D10/13
    • Circuits, methods, and apparatus that provide the die area and power savings of a single-ported memory with the performance advantages of a multiported memory. One example provides register allocation methods for storing data in a multiple-bank register file. In a thin register allocation method, data for a process is stored in a single bank. In this way, different processes use different banks to avoid conflicts. In a fat register allocation method, processes store data in each bank. In this way, if one process uses a large number of registers, those registers are spread among the banks, avoiding a situation where one bank is filled and other processes are forced to share a reduced number of banks. In a hybrid register allocation method, processes store data in more than one bank, but fewer than all the banks. Each of these methods may be combined in varying ways.
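The three allocation policies the abstract names (thin, fat, hybrid) can be sketched as one assignment function. The function signature, the `(bank, slot)` output shape, and the half-of-the-banks subset size for the hybrid case are illustrative assumptions, not details from the patent.

```python
# Sketch of register allocation over a multi-bank register file:
# "thin" keeps all of a process's registers in one bank (different
# processes use different banks to avoid conflicts), "fat" stripes
# registers round-robin across every bank, and "hybrid" stripes across
# a subset of banks. Names and the subset size are illustrative.
def allocate(num_regs: int, num_banks: int, policy: str,
             start_bank: int = 0) -> list[tuple[int, int]]:
    """Return (bank, slot) assignments for one process's registers."""
    if policy == "thin":
        chosen = [start_bank % num_banks]
    elif policy == "fat":
        chosen = list(range(num_banks))
    elif policy == "hybrid":
        k = max(2, num_banks // 2)       # illustrative subset size
        chosen = [(start_bank + i) % num_banks for i in range(k)]
    else:
        raise ValueError(f"unknown policy: {policy}")
    # Stripe the registers round-robin over the chosen banks.
    return [(chosen[i % len(chosen)], i // len(chosen))
            for i in range(num_regs)]
```

Fat allocation spreads a register-hungry process across all banks so no single bank fills up and crowds out its neighbors, which is the failure mode the abstract describes for the thin policy.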