    • 5. Granted Patent
    • Title: Coalescing memory barrier operations across multiple parallel threads
    • Publication No.: US09223578B2
    • Publication Date: 2015-12-29
    • Application No.: US12887081
    • Filing Date: 2010-09-21
    • Inventors: John R. Nickolls; Steven James Heinrich; Brett W. Coon; Michael C. Shebanow
    • IPC: G06F9/46; G06F9/38; G06F9/30
    • CPC: G06F9/3834; G06F9/3004; G06F9/30087; G06F9/3851
    • Abstract: One embodiment of the present invention sets forth a technique for coalescing memory barrier operations across multiple parallel threads. Memory barrier requests from a given parallel thread processing unit are coalesced to reduce the impact to the rest of the system. Additionally, memory barrier requests may specify a level of a set of threads with respect to which the memory transactions are committed. For example, a first type of memory barrier instruction may commit the memory transactions to a level of a set of cooperating threads that share an L1 (level one) cache. A second type of memory barrier instruction may commit the memory transactions to a level of a set of threads sharing a global memory. Finally, a third type of memory barrier instruction may commit the memory transactions to a system level of all threads sharing all system memories. The latency required to execute the memory barrier instruction varies based on the type of memory barrier instruction. (An illustrative CUDA sketch of these three barrier levels appears after this list.)
    • 6. Patent Application
    • Title: COALESCING MEMORY BARRIER OPERATIONS ACROSS MULTIPLE PARALLEL THREADS
    • Publication No.: US20110078692A1
    • Publication Date: 2011-03-31
    • Application No.: US12887081
    • Filing Date: 2010-09-21
    • Inventors: John R. Nickolls; Steven James Heinrich; Brett W. Coon; Michael C. Shebanow
    • IPC: G06F9/46
    • CPC: G06F9/3834; G06F9/3004; G06F9/30087; G06F9/3851
    • Abstract: One embodiment of the present invention sets forth a technique for coalescing memory barrier operations across multiple parallel threads. Memory barrier requests from a given parallel thread processing unit are coalesced to reduce the impact to the rest of the system. Additionally, memory barrier requests may specify a level of a set of threads with respect to which the memory transactions are committed. For example, a first type of memory barrier instruction may commit the memory transactions to a level of a set of cooperating threads that share an L1 (level one) cache. A second type of memory barrier instruction may commit the memory transactions to a level of a set of threads sharing a global memory. Finally, a third type of memory barrier instruction may commit the memory transactions to a system level of all threads sharing all system memories. The latency required to execute the memory barrier instruction varies based on the type of memory barrier instruction.
    • 9. Granted Patent
    • Title: Support for non-local returns in parallel thread SIMD engine
    • Publication No.: US08572355B2
    • Publication Date: 2013-10-29
    • Application No.: US12881065
    • Filing Date: 2010-09-13
    • Inventors: Guillermo Juan Rozas; Brett W. Coon
    • IPC: G06F9/30
    • CPC: G06F9/30058; G06F9/3851
    • Abstract: One embodiment of the present invention sets forth a method for executing a non-local return instruction in a parallel thread processor. The method comprises the steps of receiving, within the thread group, a first long jump instruction and, in response, popping a first token from the execution stack. The method also comprises determining whether the first token is a first long jump token that was pushed onto the execution stack when a first push instruction associated with the first long jump instruction was executed, and when the first token is the first long jump token, jumping to the second instruction based on the address specified by the first long jump token, or, when the first token is not the first long jump token, disabling the active thread until the first long jump token is popped from the execution stack. (An illustrative execution-stack model of this mechanism appears after this list.)
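The three barrier levels described in entries 5 and 6 (cooperating threads sharing an L1 cache, all threads sharing a device's global memory, and all threads sharing all system memories) line up with the memory fence intrinsics that CUDA C++ exposes as __threadfence_block(), __threadfence(), and __threadfence_system(). The sketch below is only an illustration of that assumed correspondence, not code from the patent: a producer thread writes a value, issues a device-scope fence so the write becomes visible before a flag is raised, and a consumer thread in another block spins on the flag.

    #include <cstdio>

    // Assumed mapping of the three barrier levels in the abstract to CUDA fences:
    //   __threadfence_block()  - visible to cooperating threads sharing the CTA / L1
    //   __threadfence()        - visible to all threads sharing device global memory
    //   __threadfence_system() - visible to the whole system (device, host, peers)
    __global__ void producer_consumer(volatile int *data, volatile int *flag)
    {
        if (blockIdx.x == 0 && threadIdx.x == 0) {
            *data = 42;              // write the payload
            __threadfence();         // device-scope barrier: commit *data before the flag
            *flag = 1;               // publish
        } else if (blockIdx.x == 1 && threadIdx.x == 0) {
            while (*flag == 0) { }   // spin until the producer publishes
            __threadfence();         // pair the fence on the consumer side
            printf("consumer saw %d\n", *data);
        }
    }

    int main()
    {
        int *data = nullptr, *flag = nullptr;
        cudaMalloc((void **)&data, sizeof(int));
        cudaMalloc((void **)&flag, sizeof(int));
        cudaMemset(data, 0, sizeof(int));
        cudaMemset(flag, 0, sizeof(int));

        producer_consumer<<<2, 32>>>(data, flag);   // two blocks: producer and consumer
        cudaDeviceSynchronize();

        cudaFree(data);
        cudaFree(flag);
        return 0;
    }

Swapping __threadfence() for __threadfence_block() or __threadfence_system() narrows or widens the set of threads with respect to which the write is committed, which is the trade-off in latency the abstract describes.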
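Entry 9 describes unwinding a divergent thread group through tokens on a per-warp execution stack. No public CUDA API exposes that machinery, so the following host-side C++ model (also compilable as a .cu file) is purely illustrative; names such as TokenType, push_longjmp, and longjmp_instr are hypothetical and do not come from the patent. A push records a long-jump token carrying the return target; the long-jump instruction pops tokens, treating the thread as disabled while non-matching tokens are unwound, and resumes at the token's address once the matching long-jump token is popped.

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    enum class TokenType { Sync, Call, LongJump };   // hypothetical token kinds

    struct Token {
        TokenType type;
        uint32_t  address;                // resume / jump target address
    };

    struct Warp {
        std::vector<Token> execStack;     // per-warp execution (reconvergence) stack
        uint32_t pc = 0;
        bool active = true;
    };

    // Executed before entering the region that may perform a non-local return;
    // records the return target on the execution stack as a long-jump token.
    void push_longjmp(Warp &w, uint32_t target) {
        w.execStack.push_back({TokenType::LongJump, target});
    }

    // Long-jump instruction: pop tokens until the matching long-jump token is
    // found; while non-matching tokens are unwound the thread is disabled,
    // mirroring the behaviour described in the abstract.
    void longjmp_instr(Warp &w) {
        while (!w.execStack.empty()) {
            Token t = w.execStack.back();
            w.execStack.pop_back();
            if (t.type == TokenType::LongJump) {
                w.pc = t.address;         // jump to the address in the token
                w.active = true;
                return;
            }
            w.active = false;             // disabled until the long-jump token pops
        }
    }

    int main() {
        Warp w;
        push_longjmp(w, 0x400);                           // target of the non-local return
        w.execStack.push_back({TokenType::Call, 0x120});  // nested state pushed later
        w.execStack.push_back({TokenType::Sync, 0x180});

        longjmp_instr(w);                 // unwinds Sync and Call, then jumps to 0x400
        std::printf("pc = 0x%x, active = %d\n", w.pc, w.active);
        return 0;
    }

In the real hardware the unwinding happens per thread within a warp over multiple cycles; this model collapses that into a single loop purely to show the token-matching logic.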