Quick Patent Search: fast search of global patents, free commercial patent database - IPRDB

1. Granted invention patent

US08640113B2 setjmp/longjmp for speculative execution frameworks (status: lapsed)
Publication No.: US08640113B2
Publication date: 2014-01-28
Application No.: US13026702
Filing date: 2011-02-14
Applicants: Raul Esteban Silvera, Kai-Ting Amy Wang, Peng Wu, Mark Wayne Yamashita, Xiaotong Zhuang
Inventors: Raul Esteban Silvera, Kai-Ting Amy Wang, Peng Wu, Mark Wayne Yamashita, Xiaotong Zhuang
IPC classification: G06F9/45
CPC classification: G06F9/3842, G06F9/3004, G06F9/30054, G06F9/30087, G06F9/4484
Abstract: A process for check pointing in speculative execution frameworks, identifies calls to a set of setjmp/longjmp instructions to form identified calls to setjmp/longjmp, determines a control flow path between a call to a setjmp and a longjmp pair of instructions in the identified calls to setjmp/longjmp and replaces calls to the setjmp/longjmp pair of instructions with calls to an improved_setjmp and improved_longjmp instruction pair. The process creates a context data structure in memory, computes a non-volatile save/restore set and replaces the call to improved_setjmp of the setjmp/longjmp pair of instructions with instructions to save all required non-volatile and special purpose registers and replaces a call to improved_longjmp of the setjmp/longjmp pair of instructions with instructions to restore all required non-volatile and special purpose registers and to branch to an instruction immediately following a block of code containing the call to improved_setjmp.

2. Invention patent application

US20110289303A1 SETJMP/LONGJMP FOR SPECULATIVE EXECUTION FRAMEWORKS (status: lapsed)
Publication No.: US20110289303A1
Publication date: 2011-11-24
Application No.: US13026702
Filing date: 2011-02-14
Applicants: Raul Esteban Silvera, Kai-Ting Amy Wang, Peng Wu, Mark Wayne Yamashita, Xiaotong Zhuang
Inventors: Raul Esteban Silvera, Kai-Ting Amy Wang, Peng Wu, Mark Wayne Yamashita, Xiaotong Zhuang
IPC classification: G06F9/312
CPC classification: G06F9/3842, G06F9/3004, G06F9/30054, G06F9/30087, G06F9/4484
Abstract: A process for check pointing in speculative execution frameworks, identifies calls to a set of setjmp/longjmp instructions to form identified calls to setjmp/longjmp, determines a control flow path between a call to a setjmp and a longjmp pair of instructions in the identified calls to setjmp/longjmp and replaces calls to the setjmp/longjmp pair of instructions with calls to an improved_setjmp and improved_longjmp instruction pair. The process creates a context data structure in memory, computes a non-volatile save/restore set and replaces the call to improved_setjmp of the setjmp/longjmp pair of instructions with instructions to save all required non-volatile and special purpose registers and replaces a call to improved_longjmp of the setjmp/longjmp pair of instructions with instructions to restore all required non-volatile and special purpose registers and to branch to an instruction immediately following a block of code containing the call to improved_setjmp.

3. Granted invention patent

US08146071B2 Pipelined parallelization of multi-dimensional loops with multiple data dependencies (status: lapsed)
Publication No.: US08146071B2
Publication date: 2012-03-27
Application No.: US11857211
Filing date: 2007-09-18
Applicants: Raul Esteban Silvera, Priya Unnikrishnan
Inventors: Raul Esteban Silvera, Priya Unnikrishnan
IPC classification: G06F9/45
CPC classification: G06F8/4452, G06F8/443
Abstract: A mechanism for folding all the data dependencies in a loop into a single, conservative dependence. This mechanism leads to one pair of synchronization primitives per loop. This mechanism does not require complicated, multi-stage compile time analysis. This mechanism considers only the data dependence information in the loop. The low synchronization cost balances the loss in parallelism due to the reduced overlap between iterations. Additionally, a novel scheme is presented to implement required synchronization to enforce data dependences in a DOACROSS loop. The synchronization is based on an iteration vector, which identifies a spatial position in the iteration space of the loop. Multiple iterations executing in parallel have their own iteration vector for synchronization where they update their position in the iteration space. As no sequential updates to the synchronization variable exist, this method exploits a greater degree of parallelism.
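
The folding idea in this abstract can be illustrated with a small Python sketch. It is a simplified illustration and not the patented method: the loop body, the function names, and the per-row `progress` counter (standing in for an iteration vector) are all assumptions. Each element depends on two elements of the previous row, at distances (1, 0) and (1, -1). Both dependences are folded into one conservative wait ("the previous row has finished column j+1"), so each iteration needs a single wait/post pair.

```python
import threading


def wavefront_sequential(first_row, n_rows):
    """Reference result: grid[i][j] = grid[i-1][j] + grid[i-1][j+1]."""
    m = len(first_row)
    grid = [list(first_row)] + [[0] * m for _ in range(n_rows - 1)]
    for i in range(1, n_rows):
        for j in range(m):
            right = grid[i - 1][j + 1] if j + 1 < m else 0
            grid[i][j] = grid[i - 1][j] + right
    return grid


def wavefront_doacross(first_row, n_rows):
    """Same computation, one thread per row, one wait/post pair per iteration."""
    m = len(first_row)
    grid = [list(first_row)] + [[0] * m for _ in range(n_rows - 1)]
    # progress[i] is the last column row i has completed (hypothetical stand-in
    # for an iteration vector). Row 0 is input data, so it starts complete.
    progress = [-1] * n_rows
    progress[0] = m - 1
    cond = threading.Condition()

    def run_row(i):
        for j in range(m):
            # Folded dependence: waiting for column j+1 of the previous row
            # also guarantees column j is done, since columns run in order.
            needed = min(j + 1, m - 1)
            with cond:
                cond.wait_for(lambda: progress[i - 1] >= needed)
            right = grid[i - 1][j + 1] if j + 1 < m else 0
            grid[i][j] = grid[i - 1][j] + right
            with cond:
                progress[i] = j  # post: publish this row's new position
                cond.notify_all()

    threads = [threading.Thread(target=run_row, args=(i,)) for i in range(1, n_rows)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return grid
```

Each row writes only its own `progress` entry, so no thread has to take turns updating a shared counter. The single condition variable is a convenience of this sketch rather than something the abstract specifies.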

4. Invention patent application

US20090158018A1 Method and System for Auto Parallelization of Zero-Trip Loops Through the Induction Variable Substitution (status: lapsed)
Publication No.: US20090158018A1
Publication date: 2009-06-18
Application No.: US12356978
Filing date: 2009-01-21
Applicants: Zhixing Ren, Raul Esteban Silvera, Guansong Zhang
Inventors: Zhixing Ren, Raul Esteban Silvera, Guansong Zhang
IPC classification: G06F9/44
CPC classification: G06F8/443, G06F8/452
Abstract: A method and system of auto parallelization of zero-trip loops that substitutes a nested basic linear induction variable by exploiting a parallelizing compiler is provided. Provided is a use of a max{0,N} variable for loop iterations in case of no information is known about the value of N, for a typical loop iterating from 1 to N, in which N is the loop invariant. For the nested basic induction variables, an induction variable substitution process is applied to the nested loops starting from the innermost loop to the outermost one. Then a removal of the max operator afterwards through a copy propagation pass of the IBM compiler is provided. In doing so, the loop dependency on the induction variable is eliminated and an opportunity for a parallelizing compiler to parallel the outermost loop is provided.
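
The role of max{0,N} can be seen in a minimal Python sketch. The function names and the loop body are hypothetical and do not come from the patent. The first function carries an induction variable `k` across iterations. The second replaces it with a closed form, using max(0, n) and max(0, m) as trip counts so the result stays correct when a loop executes zero times.

```python
def original_loops(n, m):
    """Nested loops that carry the induction variable k across iterations."""
    k = 0
    seen = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k += 1  # loop-carried update: iteration order matters
            seen.append(k)
    return seen, k


def substituted_loops(n, m):
    """Same values, with k computed from (i, j) instead of being carried."""
    inner_trips = max(0, m)  # a loop from 1 to m runs max(0, m) times
    outer_trips = max(0, n)
    seen = []
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = (i - 1) * inner_trips + j  # closed form: no cross-iteration dependence
            seen.append(k)
    # Final value of k after the nest, valid even for zero-trip loops.
    return seen, outer_trips * inner_trips
```

For n = -2 and m = 5 neither loop body runs and the final `k` is 0. A closed form without the max operator would give n * m = -10 instead, which is why the trip count cannot simply be taken as N when nothing is known about its value.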

5. Invention patent application

US20090106745A1 Method and Apparatus for Optimizing Software Program Using Inter-Procedural Strength Reduction (status: lapsed)
Publication No.: US20090106745A1
Publication date: 2009-04-23
Application No.: US12270707
Filing date: 2008-11-13
Applicants: Roch Georges Archambault, Shimin Cui, Raul Esteban Silvera
Inventors: Roch Georges Archambault, Shimin Cui, Raul Esteban Silvera
IPC classification: G06F9/45
CPC classification: G06F8/443
Abstract: Inter-procedural strength reduction is provided by a mechanism of the present invention to optimize software program. During a forward pass, the present invention collects information of global variables and analyzes the information to select candidate computations for optimization. During a backward pass, the present invention replaces costly computations with less costly or weaker computations using pre-computed values and inserts store operations of new global variables to pre-compute the costly computations at definition points of the global variables used in the costly computations.
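
The effect of the transformation described here can be sketched by hand in Python. This is an illustration under stated assumptions, not compiler output: the globals `_scale` and `_scale_sqrt` and all function names are hypothetical. The costly computation `math.sqrt(_scale)` is replaced at its use site with a read of a new global, and a store to that global is inserted at the point where `_scale` is defined.

```python
import math

_scale = 4.0
# New global introduced by the optimization (hypothetical name): it holds the
# pre-computed value of the costly expression math.sqrt(_scale).
_scale_sqrt = math.sqrt(_scale)


def set_scale(value):
    """Definition point of the global; the pre-computing store is inserted here."""
    global _scale, _scale_sqrt
    _scale = value
    _scale_sqrt = math.sqrt(value)  # inserted store keeps the cached value current


def normalize_original(xs):
    """Before: the costly computation is repeated at every use."""
    return [x * math.sqrt(_scale) for x in xs]


def normalize_reduced(xs):
    """After: the use site reads the pre-computed global instead."""
    return [x * _scale_sqrt for x in xs]
```

The sketch is only valid if every assignment to `_scale` goes through `set_scale`. In a compiler, establishing that all definition points across procedures are covered would be the job of the analysis pass.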

6. Granted invention patent

US07472382B2 Method for optimizing software program using inter-procedural strength reduction (status: lapsed)
Publication No.: US07472382B2
Publication date: 2008-12-30
Application No.: US10930038
Filing date: 2004-08-30
Applicants: Roch Georges Archambault, Shimin Cui, Raul Esteban Silvera
Inventors: Roch Georges Archambault, Shimin Cui, Raul Esteban Silvera
IPC classification: G06F9/45
CPC classification: G06F8/443
Abstract: Inter-procedural strength reduction is provided by a mechanism of the present invention to optimize software program. During a forward pass, the present invention collects information of global variables and analyzes the information to select candidate computations for optimization. During a backward pass, the present invention replaces costly computations with less costly or weaker computations using pre-computed values and inserts store operations of new global variables to pre-compute the costly computations at definition points of the global variables used in the costly computations.

7. Granted invention patent

US09038045B2 Unified parallel C work-sharing loop construct transformation (status: in force)
Publication No.: US09038045B2
Publication date: 2015-05-19
Application No.: US13296705
Filing date: 2011-11-15
Applicants: Yaoqing Gao, Liangxiao Hu, Raul Esteban Silvera, Ettore Tiotto
Inventors: Yaoqing Gao, Liangxiao Hu, Raul Esteban Silvera, Ettore Tiotto
IPC classification: G06F9/45, G06F9/44
CPC classification: G06F8/51, G06F8/314
Abstract: Control flow information and data flow information associated with a program containing a upc_forall loop are built. A shared reference map data structure using the control flow information and the data flow information is created. All local shared accesses are hashed to facilitate a constant access stride after being rewritten. All local shared references in a hash entry having a longest list are privatized. The upc_forall loop is rewritten into a for loop. Responsive to a determination that an unprocessed upc_forall loop does not exist, dead store elimination is run. The control flow information and the data flow information associated with the program containing the for loop is rebuilt.
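
The step "the upc_forall loop is rewritten into a for loop" can be pictured with a small Python simulation. It covers only the simplest case, an integer affinity expression equal to the loop index, and it leaves out the hashing and privatization of shared references described in the abstract. The function names are hypothetical. Under UPC semantics, a thread executes iteration i of such a loop when i % THREADS equals MYTHREAD, which a plain loop with a constant stride of THREADS reproduces without a per-iteration affinity test.

```python
def upc_forall_iterations(n, threads, my_thread):
    """Iterations one thread executes for upc_forall(i = 0; i < n; i++; i)."""
    # Every thread walks the full range and keeps iterations with matching affinity.
    return [i for i in range(n) if i % threads == my_thread]


def rewritten_for_iterations(n, threads, my_thread):
    """Equivalent plain loop: for (i = MYTHREAD; i < n; i += THREADS)."""
    # Constant stride of `threads`; no affinity check per iteration.
    return list(range(my_thread, n, threads))
```

With n = 10 and 4 threads, thread 1 runs iterations 1, 5 and 9 in both forms.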

8. Granted invention patent

US08484630B2 Code motion based on live ranges in an optimizing compiler (status: lapsed)
Publication No.: US08484630B2
Publication date: 2013-07-09
Application No.: US12343228
Filing date: 2008-12-23
Applicants: Shimin Cui, Raul Esteban Silvera
Inventors: Shimin Cui, Raul Esteban Silvera
IPC classification: G06F9/45
CPC classification: G06F8/443
Abstract: Optimizing program code in a static compiler by determining the live ranges of variables and determining which live ranges are candidates for moving code from the use site to the definition site of source code. Live ranges for variables in a flow graph are determined. Selected live ranges are determined as candidates in which code will be moved from a use site within the source code to a definition site within the source code. Optimization opportunities within the source code are identified based on the code motion.

9. Granted invention patent

US08104030B2 Mechanism to restrict parallelization of loops (status: lapsed)
Publication No.: US08104030B2
Publication date: 2012-01-24
Application No.: US11314456
Filing date: 2005-12-21
Applicants: Raul Esteban Silvera, Priya Unnikrishnan, Guansong Zhang
Inventors: Raul Esteban Silvera, Priya Unnikrishnan, Guansong Zhang
IPC classification: G06F9/44, G06F9/45
CPC classification: G06F8/4452
Abstract: A computer implemented method, computer usable program code, and a system for parallelizing a loop. A parameter that will be used to limit parallelization of the loop is identified to limit parallelization of the loop. The parameter specifies a minimum number of loop iterations that a thread should execute. The parameter can be adjusted based on a parallel performance factor. A parallel performance factor is a factor that influences the performance of parallel code. A number of threads from a plurality of threads is selected for processing iterations of the loop based on the parameter. The number of threads is selected prior to execution of the first iteration of the loop.
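
One plausible reading of the thread-selection rule is sketched below in Python. The abstract does not give a formula, so the function name, its parameters, and the arithmetic are assumptions: enough threads are used that each still receives at least the minimum number of iterations, capped by the threads available, with a floor of one.

```python
def select_thread_count(total_iterations, available_threads, min_iterations_per_thread):
    """Pick a thread count before the loop starts (hypothetical rule)."""
    if available_threads < 1:
        raise ValueError("available_threads must be at least 1")
    if min_iterations_per_thread < 1:
        raise ValueError("min_iterations_per_thread must be at least 1")
    # How many threads can each be given the minimum amount of work.
    useful_threads = total_iterations // min_iterations_per_thread
    # Never exceed what is available; fall back to one thread for small loops.
    return max(1, min(available_threads, useful_threads))
```

With a minimum of 100 iterations per thread and 8 threads available, a 1000-iteration loop would use all 8 threads, a 250-iteration loop would use 2, and a 50-iteration loop would run on a single thread.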

10. Invention patent application

US20090077545A1 PIPELINED PARALLELIZATION OF MULTI-DIMENSIONAL LOOPS WITH MULTIPLE DATA DEPENDENCIES (status: lapsed)
Publication No.: US20090077545A1
Publication date: 2009-03-19
Application No.: US11857211
Filing date: 2007-09-18
Applicants: Raul Esteban Silvera, Priya Unnikrishnan
Inventors: Raul Esteban Silvera, Priya Unnikrishnan
IPC classification: G06F9/45
CPC classification: G06F8/4452, G06F8/443
Abstract: A mechanism for folding all the data dependencies in a loop into a single, conservative dependence. This mechanism leads to one pair of synchronization primitives per loop. This mechanism does not require complicated, multi-stage compile time analysis. This mechanism considers only the data dependence information in the loop. The low synchronization cost balances the loss in parallelism due to the reduced overlap between iterations. Additionally, a novel scheme is presented to implement required synchronization to enforce data dependences in a DOACROSS loop. The synchronization is based on an iteration vector, which identifies a spatial position in the iteration space of the loop. Multiple iterations executing in parallel have their own iteration vector for synchronization where they update their position in the iteration space. As no sequential updates to the synchronization variable exist, this method exploits a greater degree of parallelism.
