Quick Patent Search - fast search of global patents, free patent database for commercial use - IPRDB

11. Granted invention patent

US08468531B2 Method and apparatus for efficient inter-thread synchronization for helper threads (status: in force)
Publication No.: US08468531B2
Publication date: 2013-06-18
Application No.: US12787810
Filing date: 2010-05-26
Applicants: Michael K. Gschwind, John K. O'Brien, Valentina Salapura, Zehra N. Sura
Inventors: Michael K. Gschwind, John K. O'Brien, Valentina Salapura, Zehra N. Sura
IPC classification: G06F9/46, G06F12/08
CPC classification: G06F9/30087, G06F9/383, G06F9/3834, G06F9/3842, G06F9/3851, G06F9/4881, G06F9/5022, G06F9/52, G06F9/542
Abstract: A monitor bit per hardware thread in a memory location may be allocated, in a multiprocessing computer system having a plurality of hardware threads, the plurality of hardware threads sharing the memory location, and each of the allocated monitor bits corresponding to one of the plurality of hardware threads. A condition bit may be allocated for each of the plurality of hardware threads, the condition bit being allocated in each context of the plurality of hardware threads. In response to detecting the memory location being accessed, it is determined whether a monitor bit corresponding to a hardware thread in the memory location is set. In response to determining that the monitor bit corresponding to a hardware thread is set in the memory location, a condition bit corresponding to a thread accessing the memory location is set in the hardware thread's context.
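
The monitor-bit and condition-bit interplay in this abstract can be illustrated with a small software model. This is only a sketch of one possible reading of the abstract, not the patented hardware: the `Machine` class, its method names, and the choice to keep one condition bit per accessing thread in every context are all assumptions made for illustration.

```python
class Machine:
    """Toy model of per-location monitor bits and per-context condition bits."""

    def __init__(self, num_threads):
        self.num_threads = num_threads
        # address -> one monitor bit per hardware thread (assumed layout)
        self.monitor_bits = {}
        # condition_bits[t][a] lives in thread t's context and is set when
        # thread a accesses a location that thread t monitors (assumption)
        self.condition_bits = [
            [False] * num_threads for _ in range(num_threads)
        ]

    def set_monitor(self, thread, address):
        """Thread `thread` asks to be notified about accesses to `address`."""
        bits = self.monitor_bits.setdefault(
            address, [False] * self.num_threads
        )
        bits[thread] = True

    def access(self, accessor, address):
        """Model a load or store by `accessor`; notify monitoring threads."""
        bits = self.monitor_bits.get(address)
        if bits is None:
            return  # nobody monitors this location
        for thread, monitored in enumerate(bits):
            if monitored:
                self.condition_bits[thread][accessor] = True


# Usage: helper thread 1 monitors address 0x100; application thread 0 touches it.
machine = Machine(num_threads=2)
machine.set_monitor(thread=1, address=0x100)
machine.access(accessor=0, address=0x200)   # unmonitored, no effect
machine.access(accessor=0, address=0x100)   # sets a bit in thread 1's context
print(machine.condition_bits[1][0])
```

In this model the helper thread can poll its own condition bit instead of spinning on shared memory, which is the kind of lightweight synchronization the title refers to.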

12. Granted invention patent

US08453161B2 Method and apparatus for efficient helper thread state initialization using inter-thread register copy (status: in force)
Publication No.: US08453161B2
Publication date: 2013-05-28
Application No.: US12787128
Filing date: 2010-05-25
Applicants: Michael K. Gschwind, John K. O'Brien, Valentina Salapura, Zehra N. Sura
Inventors: Michael K. Gschwind, John K. O'Brien, Valentina Salapura, Zehra N. Sura
IPC classification: G06F13/00, G06F12/00, G06F9/30
CPC classification: G06F9/544
Abstract: This disclosure describes a method and system that may enable fast, hardware-assisted, producer-consumer style communication of values between threads. The method, in one aspect, uses a dedicated hardware buffer as an intermediary storage for transferring values from registers in one thread to registers in another thread. The method may provide a generic, programmable solution that can transfer any subset of register values between threads in any given order, where the source and target registers may or may not be correlated. The method also may allow for determinate access times, since it completely bypasses the memory hierarchy. Also, the method is designed to be lightweight, focusing on communication, and keeping synchronization facilities orthogonal to the communication mechanism. It may be used by a helper thread that performs data prefetching for an application thread, for example, to initialize the upward-exposed reads in the address computation slice of the helper thread code.
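
The producer-consumer register transfer described in this abstract might be modelled as follows. This is a hedged software sketch only: the `TransferBuffer` class, the `send_registers` and `receive_registers` helpers, and the register names are hypothetical, and synchronization is deliberately left out because the abstract treats it as orthogonal to communication.

```python
from collections import deque


class TransferBuffer:
    """Stand-in for the dedicated hardware buffer (FIFO order is assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = deque()

    def put(self, value):
        if len(self.slots) >= self.capacity:
            raise RuntimeError("transfer buffer full")
        self.slots.append(value)

    def get(self):
        if not self.slots:
            raise RuntimeError("transfer buffer empty")
        return self.slots.popleft()


def send_registers(registers, buffer, sources):
    """Producer side: copy the chosen source registers, in the given order."""
    for name in sources:
        buffer.put(registers[name])


def receive_registers(registers, buffer, targets):
    """Consumer side: fill target registers, which need not match the sources."""
    for name in targets:
        registers[name] = buffer.get()


# Usage: the application thread seeds two registers of a helper thread.
app_regs = {"r3": 10, "r4": 20, "r5": 30}
helper_regs = {"r7": 0, "r9": 0}
buf = TransferBuffer(capacity=8)
send_registers(app_regs, buf, ["r5", "r3"])         # any subset, any order
receive_registers(helper_regs, buf, ["r9", "r7"])   # unrelated target names
print(helper_regs)
```

Because the values never pass through memory in this model, the transfer cost depends only on the number of registers moved, which mirrors the abstract's point about determinate access times.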

13. Invention patent application

US20110088020A1 PARALLELIZATION OF IRREGULAR REDUCTIONS VIA PARALLEL BUILDING AND EXPLOITATION OF CONFLICT-FREE UNITS OF WORK AT RUNTIME (status: lapsed)
Publication No.: US20110088020A1
Publication date: 2011-04-14
Application No.: US12576717
Filing date: 2009-10-09
Applicants: Alexandre E. Eichenberger, Yangchun Luo, John K. O'Brien, Xiaotong Zhuang
Inventors: Alexandre E. Eichenberger, Yangchun Luo, John K. O'Brien, Xiaotong Zhuang
IPC classification: G06F9/45
CPC classification: G06F8/456
Abstract: An optimizing compiler device, a method, and a computer program product which are capable of performing parallelization of irregular reductions. The method for performing parallelization of irregular reductions includes receiving, at a compiler, a program and selecting, at compile time, at least one unit of work (UW) from the program, each UW configured to operate on at least one reduction operation, where at least one reduction operation in the UW operates on a reduction variable whose address is determinable when running the program at a run-time. At run time, for each successive current UW, a list of reduction operations accessed by that unit of work is recorded. Further, it is determined at run time whether reduction operations accessed by a current UW conflict with any reduction operations recorded as having been accessed by prior selected units of work, and assigning the unit of work as a conflict free unit of work (CFUW) when no conflicts are found. Finally, there is scheduled, for parallel run-time operation, at least two or more processing threads to process respective ones of the at least two or more assigned CFUWs.
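
The run-time selection of conflict-free units of work can be sketched in a few lines. This is an illustrative approximation, not the patented method: the function names are hypothetical, the selection here is built sequentially (the patent title speaks of parallel building), and deferring conflicting units to a later round is an assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def select_conflict_free(units):
    """Split units into a conflict-free batch and a deferred remainder.

    Each unit is (iterations, targets), where targets is the set of
    reduction-variable indices the unit updates.
    """
    claimed = set()
    batch, deferred = [], []
    for unit in units:
        _, targets = unit
        if claimed.isdisjoint(targets):
            claimed |= targets      # record what this unit accesses
            batch.append(unit)
        else:
            deferred.append(unit)   # conflicts with an earlier selected unit
    return batch, deferred


def irregular_reduction(index, contrib, size, unit_len, workers=2):
    """Compute result[index[i]] += contrib[i] using conflict-free batches."""
    result = [0] * size
    units = []
    for start in range(0, len(index), unit_len):
        iters = range(start, min(start + unit_len, len(index)))
        units.append((iters, {index[i] for i in iters}))

    def work(unit):
        iters, _ = unit
        for i in iters:
            result[index[i]] += contrib[i]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while units:
            batch, units = select_conflict_free(units)
            # Units in one batch touch disjoint elements, so no locking.
            list(pool.map(work, batch))
    return result


index = [0, 1, 0, 2, 1, 3, 3, 0]
contrib = [1, 2, 3, 4, 5, 6, 7, 8]
print(irregular_reduction(index, contrib, size=4, unit_len=2))
```

The first unit of every round is always accepted, so each round makes progress and the loop terminates.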

14. Invention patent application

US20100023932A1 Efficient Software Cache Accessing With Handle Reuse (status: in force)
Publication No.: US20100023932A1
Publication date: 2010-01-28
Application No.: US12177543
Filing date: 2008-07-22
Applicants: Alexandre E. Eichenberger, Marc Gonzalez Tallada, John K. O'Brien
Inventors: Alexandre E. Eichenberger, Marc Gonzalez Tallada, John K. O'Brien
IPC classification: G06F9/45
CPC classification: G06F8/4442
Abstract: A mechanism for efficient software cache accessing with handle reuse is provided. The mechanism groups references in source code into a reference stream with the reference stream having a size equal to or less than a size of a software cache line. The source code is transformed into optimized code by modifying the source code to include code for performing at most two cache lookup operations for the reference stream to obtain two cache line handles. Moreover, the transformation involves inserting code to resolve references in the reference stream based on the two cache line handles. The optimized code may be output for generation of executable code.
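
The idea behind handle reuse is that a reference stream no longer than one cache line can straddle at most two lines, so two lookups suffice for the whole stream. The sketch below illustrates that reasoning only; `SoftwareCache`, `read_stream`, and the 16-element `LINE_SIZE` are assumptions, and the code shows the run-time effect rather than the compiler transformation itself.

```python
LINE_SIZE = 16  # elements per software cache line (assumed value)


class SoftwareCache:
    """Tiny software-managed cache over a backing list."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}      # line tag -> locally cached copy of the line
        self.lookups = 0     # counts lookup operations for illustration

    def lookup(self, address):
        """Return a handle (the cached line) for the line holding address."""
        self.lookups += 1
        tag = address // LINE_SIZE
        if tag not in self.lines:
            base = tag * LINE_SIZE
            self.lines[tag] = self.memory[base:base + LINE_SIZE]
        return self.lines[tag]


def read_stream(cache, start, count):
    """Read `count` consecutive elements using at most two lookups."""
    if not 0 < count <= LINE_SIZE:
        raise ValueError("stream must fit within one cache line size")
    first = cache.lookup(start)
    split = (start // LINE_SIZE + 1) * LINE_SIZE  # first address of next line
    last = start + count - 1
    second = cache.lookup(last) if last >= split else first
    values = []
    for address in range(start, start + count):
        handle = first if address < split else second  # reuse, no new lookup
        values.append(handle[address % LINE_SIZE])
    return values


memory = list(range(100))
cache = SoftwareCache(memory)
print(read_stream(cache, start=10, count=12), cache.lookups)
```

Without handle reuse, each of the 12 references would perform its own lookup; here the stream crossing a line boundary costs two lookups, and a stream inside one line costs one.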

15. Granted invention patent

US08819651B2 Efficient software cache accessing with handle reuse (status: in force)
Publication No.: US08819651B2
Publication date: 2014-08-26
Application No.: US12177543
Filing date: 2008-07-22
Applicants: Alexandre E. Eichenberger, Marc Gonzalez Tallada, John K. O'Brien
Inventors: Alexandre E. Eichenberger, Marc Gonzalez Tallada, John K. O'Brien
IPC classification: G06F9/45
CPC classification: G06F8/4442
Abstract: A mechanism for efficient software cache accessing with handle reuse is provided. The mechanism groups references in source code into a reference stream with the reference stream having a size equal to or less than a size of a software cache line. The source code is transformed into optimized code by modifying the source code to include code for performing at most two cache lookup operations for the reference stream to obtain two cache line handles. Moreover, the transformation involves inserting code to resolve references in the reference stream based on the two cache line handles. The optimized code may be output for generation of executable code.

16. Granted invention patent

US08468508B2 Parallelization of irregular reductions via parallel building and exploitation of conflict-free units of work at runtime (status: lapsed)
Publication No.: US08468508B2
Publication date: 2013-06-18
Application No.: US12576717
Filing date: 2009-10-09
Applicants: Alexandre E. Eichenberger, Yangchun Luo, John K. O'Brien, Xiaotong Zhuang
Inventors: Alexandre E. Eichenberger, Yangchun Luo, John K. O'Brien, Xiaotong Zhuang
IPC classification: G06F9/45
CPC classification: G06F8/456
Abstract: An optimizing compiler device, a method, and a computer program product which are capable of performing parallelization of irregular reductions. The method for performing parallelization of irregular reductions includes receiving, at a compiler, a program and selecting, at compile time, at least one unit of work (UW) from the program, each UW configured to operate on at least one reduction operation, where at least one reduction operation in the UW operates on a reduction variable whose address is determinable when running the program at a run-time. At run time, for each successive current UW, a list of reduction operations accessed by that unit of work is recorded. Further, it is determined at run time whether reduction operations accessed by a current UW conflict with any reduction operations recorded as having been accessed by prior selected units of work, and assigning the unit of work as a conflict free unit of work (CFUW) when no conflicts are found. Finally, there is scheduled, for parallel run-time operation, at least two or more processing threads to process respective ones of the at least two or more assigned CFUWs.

17. Invention patent application

US20100023700A1 Dynamically Maintaining Coherency Within Live Ranges of Direct Buffers (status: lapsed)
Publication No.: US20100023700A1
Publication date: 2010-01-28
Application No.: US12177507
Filing date: 2008-07-22
Applicants: Tong Chen, John K. O'Brien, Tao Zhang
Inventors: Tong Chen, John K. O'Brien, Tao Zhang
IPC classification: G06F9/45, G06F12/08, G06F12/02
CPC classification: G06F8/4442
Abstract: Reducing coherency problems in a data processing system is provided. Source code that is to be compiled is received and analyzed to identify at least one of a plurality of loops that contain a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by a direct buffer. Responsive to an indication that the memory reference is an access to the global memory that should be handled by the direct buffer, the memory reference is marked for direct buffer transformation. The direct buffer transformation is then applied to the memory reference.

18. Granted invention patent

US08776034B2 Dynamically maintaining coherency within live ranges of direct buffers (status: lapsed)
Publication No.: US08776034B2
Publication date: 2014-07-08
Application No.: US13584356
Filing date: 2012-08-13
Applicants: Tong Chen, John K. O'Brien, Tao Zhang
Inventors: Tong Chen, John K. O'Brien, Tao Zhang
IPC classification: G06F9/45, G06T13/00, H04L12/50
CPC classification: G06F8/4442
Abstract: Reducing coherency problems in a data processing system is provided. Source code that is to be compiled is received and analyzed to identify at least one of a plurality of loops that contain a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by a direct buffer. Responsive to an indication that the memory reference is an access to the global memory that should be handled by the direct buffer, the memory reference is marked for direct buffer transformation. The direct buffer transformation is then applied to the memory reference.

19. Granted invention patent

US08285670B2 Dynamically maintaining coherency within live ranges of direct buffers (status: lapsed)
Publication No.: US08285670B2
Publication date: 2012-10-09
Application No.: US12177507
Filing date: 2008-07-22
Applicants: Tong Chen, John K. O'Brien, Tao Zhang
Inventors: Tong Chen, John K. O'Brien, Tao Zhang
IPC classification: G06F7/00
CPC classification: G06F8/4442
Abstract: Reducing coherency problems in a data processing system is provided. Source code that is to be compiled is received and analyzed to identify at least one of a plurality of loops that contain a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by a direct buffer. Responsive to an indication that the memory reference is an access to the global memory that should be handled by the direct buffer, the memory reference is marked for direct buffer transformation. The direct buffer transformation is then applied to the memory reference.

20. Granted invention patent

US08281295B2 Computer analysis and runtime coherency checking (status: lapsed)
Publication No.: US08281295B2
Publication date: 2012-10-02
Application No.: US12125982
Filing date: 2008-05-23
Applicants: Tong Chen, Haibo Lin, John K. O'Brien, Tao Zhang
Inventors: Tong Chen, Haibo Lin, John K. O'Brien, Tao Zhang
IPC classification: G06F9/45
CPC classification: G06F8/433
Abstract: Compiler analysis and runtime coherency checking for reducing coherency problems is provided. Source code is analyzed to identify at least one of a plurality of loops that contains a memory reference. A determination is made as to whether the memory reference is an access to a global memory that should be handled by at least one of a software controlled cache or a direct buffer. A determination is made as to whether there is a data dependence between the memory reference and at least one reference from at least one of other direct buffers or other software controlled caches in response to an indication that the memory reference is an access to the global memory that should be handled by either the software controlled cache or the direct buffer. A direct buffer transformation is applied to the memory reference in response to a negative indication of the data dependence.
