    • 51. Invention application
    • Title: Handling permanent and transient errors using a SIMD unit
    • Publication number: US20060190700A1
    • Publication date: 2006-08-24
    • Application number: US11063122
    • Filing date: 2005-02-22
    • Inventors: Erik Altman; Gheorghe Cascaval; Luis Ceze; Vijayalakshmi Srinivasan
    • IPC: G06F15/00
    • CPC: G06F11/1641
    • Abstract: A method for handling permanent and transient errors in a microprocessor is disclosed. The method includes reading a scalar value and a scalar operation from an execution unit of the microprocessor. The method further includes writing a copy of the scalar value into each of a plurality of elements of a vector register of a Single Instruction Multiple Data (SIMD) unit of the microprocessor and executing the scalar operation on each scalar value in each of the plurality of elements of the vector register of the SIMD unit using a vector operation. The method further includes comparing each result of the scalar operation on each scalar value in each of the plurality of elements of the vector register and detecting a permanent or transient error if all of the results are not identical. (A code sketch of this idea follows below.)
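The replicate-execute-compare structure in the abstract above can be modeled in ordinary code. The following is a minimal sketch, assuming a four-lane vector and integer addition as the scalar operation (both illustrative choices, not details taken from the patent): the scalar operands are copied into every lane, the operation runs on all lanes as one vector operation, and any disagreement among the lane results is reported as a permanent or transient error.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>

constexpr std::size_t kLanes = 4;  // assumed SIMD width (illustrative)

// "Write a copy of the scalar value into each element of a vector register."
std::array<std::uint32_t, kLanes> splat(std::uint32_t scalar) {
    std::array<std::uint32_t, kLanes> v{};
    v.fill(scalar);
    return v;
}

// Execute the scalar operation (here: addition) on every lane as a vector
// operation, then compare the lane results; any mismatch is reported as a
// permanent or transient error.
bool faulty_add(std::uint32_t a, std::uint32_t b, std::uint32_t& result) {
    auto va = splat(a);
    auto vb = splat(b);
    std::array<std::uint32_t, kLanes> r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = va[i] + vb[i];              // per-lane scalar operation
    for (std::size_t i = 1; i < kLanes; ++i)
        if (r[i] != r[0]) return true;     // lanes disagree -> error
    result = r[0];                         // all lanes agree
    return false;
}

int main() {
    std::uint32_t sum = 0;
    bool error = faulty_add(40, 2, sum);
    std::cout << (error ? std::string("error detected")
                        : "result: " + std::to_string(sum)) << '\n';
}
```

In real hardware the replication and comparison would happen inside the SIMD unit; the model only shows the control structure the abstract describes.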
    • 53. Invention grant
    • Title: Relaxation of synchronization for iterative convergent computations
    • Publication number: US09069545B2
    • Publication date: 2015-06-30
    • Application number: US13184718
    • Filing date: 2011-07-18
    • Inventors: Lakshminarayanan Renganarayana; Vijayalakshmi Srinivasan
    • IPC: G06F9/44; G06F9/30; G06F9/52; G06F9/45
    • CPC: G06F9/3004; G06F8/458; G06F9/30087; G06F9/30185; G06F9/52
    • Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the type of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes codes for unsynchronized data transfer for a subset of codes for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by employing a combination of software and hardware by using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer. (A code sketch of the relaxed-synchronization idea follows below.)
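As a software-level picture of what the abstract above calls substituting unsynchronized data transfer for synchronized data transfer, the following sketch accumulates a convergence residual across threads in two ways: a synchronized path using an atomic compare-and-swap, and a relaxed path that performs a plain, deliberately unsynchronized update. The solver loop, thread count, and the `relax` flag standing in for a programmer hint are illustrative assumptions, not the patent's mechanism.

```cpp
#include <atomic>
#include <cmath>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    constexpr int kThreads = 4;
    constexpr int kN = 1 << 16;
    const bool relax = true;                  // "hint": the residual tolerates races
    std::vector<double> x(kN, 1.0);

    std::atomic<double> residual_sync{0.0};   // synchronized accumulator
    double residual_relaxed = 0.0;            // unsynchronized accumulator

    auto worker = [&](int tid) {
        double local = 0.0;
        for (int i = tid; i < kN; i += kThreads) {
            double updated = 0.5 * (x[i] + 1.0);   // a convergent update
            local += std::fabs(updated - x[i]);
            x[i] = updated;                        // disjoint indices per thread
        }
        if (relax) {
            // Relaxed path: a plain, unsynchronized update. Formally a data
            // race, but for a convergence residual an occasional lost update
            // only perturbs intermediate values, which is the trade-off the
            // abstract describes for iterative convergent computations.
            residual_relaxed += local;
        } else {
            // Synchronized path: atomic read-modify-write of the shared total.
            double cur = residual_sync.load();
            while (!residual_sync.compare_exchange_weak(cur, cur + local)) {}
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < kThreads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();
    std::cout << "residual: "
              << (relax ? residual_relaxed : residual_sync.load()) << '\n';
}
```

The relaxed path is the substitution a compiler (or hardware relaxation registers, in the other embodiment) would make only for data identified as race-tolerant.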
    • 55. Invention grant
    • Title: Write-through cache optimized for dependence-free parallel regions
    • Publication number: US08516197B2
    • Publication date: 2013-08-20
    • Application number: US13025706
    • Filing date: 2011-02-11
    • Inventors: Alexandre E. Eichenberger; Alan G. Gara; Martin Ohmacht; Vijayalakshmi Srinivasan
    • IPC: G06F12/00
    • CPC: G06F12/0837
    • Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating of a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller. (A code sketch of the false-sharing pattern follows below.)
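The controller behavior claimed in the abstract above is not observable from user code, but the access pattern it targets is easy to show. The following minimal sketch, assuming a typical 64-byte cache line, runs a dependence-free parallel region in which two threads update disjoint fields that normally sit on the same line (false sharing); the program is race-free and correct, and the patent concerns letting both local caches update such a line without the usual coherence ping-pong.

```cpp
#include <iostream>
#include <thread>

// Two independent counters that will typically occupy the same 64-byte
// cache line, so writes from the two threads falsely share that line.
struct SharedLine {
    long a = 0;
    long b = 0;
};

int main() {
    SharedLine line;
    // Dependence-free parallel region: each thread touches only its own field.
    std::thread t1([&line] { for (int i = 0; i < 1000000; ++i) ++line.a; });
    std::thread t2([&line] { for (int i = 0; i < 1000000; ++i) ++line.b; });
    t1.join();
    t2.join();
    std::cout << line.a << ' ' << line.b << '\n';   // prints "1000000 1000000"
}
```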
    • 57. Invention application
    • Title: RELAXATION OF SYNCHRONIZATION FOR ITERATIVE CONVERGENT COMPUTATIONS
    • Publication number: US20130024662A1
    • Publication date: 2013-01-24
    • Application number: US13184718
    • Filing date: 2011-07-18
    • Inventors: Lakshminarayanan Renganarayana; Vijayalakshmi Srinivasan
    • IPC: G06F9/30
    • CPC: G06F9/3004; G06F8/458; G06F9/30087; G06F9/30185; G06F9/52
    • Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the type of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes codes for unsynchronized data transfer for a subset of codes for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by employing a combination of software and hardware by using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer.
    • 58. Invention application
    • Title: PREDICTING CACHE MISSES USING DATA ACCESS BEHAVIOR AND INSTRUCTION ADDRESS
    • Publication number: US20120284463A1
    • Publication date: 2012-11-08
    • Application number: US13099178
    • Filing date: 2011-05-02
    • Inventors: Vijayalakshmi Srinivasan; Brian R. Prasky
    • IPC: G06F12/08
    • CPC: G06F9/383; G06F9/3832; G06F9/3836; G06F9/3844; G06F9/3851
    • Abstract: In a decode stage of a hardware processor pipeline, one particular instruction of a plurality of instructions is decoded. It is determined that the particular instruction requires a memory access. Responsive to such determination, it is predicted whether the memory access will result in a cache miss. The predicting in turn includes accessing one of a plurality of entries in a pattern history table stored as a hardware table in the decode stage. The accessing is based, at least in part, upon at least a most recent entry in a global history buffer. The pattern history table stores a plurality of predictions. The global history buffer stores actual results of previous memory accesses as one of cache hits and cache misses. Additional steps include scheduling at least one additional one of the plurality of instructions in accordance with the predicting, and updating the pattern history table and the global history buffer subsequent to actual execution of the particular instruction in an execution stage of the hardware processor pipeline, to reflect whether the predicting was accurate. (A code sketch of this predictor organization follows below.)
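The predictor in the abstract above maps onto the familiar two-level structure used for branch prediction. The following minimal sketch assumes 2-bit saturating counters, an 8-bit global history of hit/miss outcomes, and an XOR of that history with the instruction address as the table index; the actual table size, history length, and indexing function are not given here and are illustrative assumptions.

```cpp
#include <cstdint>
#include <iostream>

// Minimal model: a pattern history table (PHT) of 2-bit saturating counters,
// indexed by the instruction address XORed with a global history of recent
// hit/miss outcomes. predict_miss() is what the decode stage would consult;
// update() is the training done after the access actually executes.
class MissPredictor {
    static constexpr int kHistoryBits = 8;               // assumed history length
    static constexpr int kEntries = 1 << kHistoryBits;
    std::uint8_t pht_[kEntries] = {};                    // 2-bit counters, 0..3
    std::uint32_t ghb_ = 0;                              // recent outcomes, 1 = miss

    int index(std::uint64_t pc) const {
        return static_cast<int>((pc ^ ghb_) & (kEntries - 1));
    }

public:
    // Decode stage: predict whether this memory access will miss.
    bool predict_miss(std::uint64_t pc) const { return pht_[index(pc)] >= 2; }

    // Execution stage: train the counter with the actual outcome and shift
    // the outcome into the global history.
    void update(std::uint64_t pc, bool missed) {
        std::uint8_t& c = pht_[index(pc)];
        if (missed && c < 3) ++c;
        if (!missed && c > 0) --c;
        ghb_ = ((ghb_ << 1) | (missed ? 1u : 0u)) & (kEntries - 1);
    }
};

int main() {
    MissPredictor p;
    const std::uint64_t pc = 0x400123;       // a load that, in this toy trace, always misses
    int correct = 0;
    for (int i = 0; i < 100; ++i) {
        if (p.predict_miss(pc)) ++correct;   // prediction at decode
        p.update(pc, /*missed=*/true);       // actual outcome after execution
    }
    std::cout << "correct miss predictions: " << correct << "/100\n";
}
```

A predicted miss would then feed the scheduling step the abstract mentions, letting independent instructions be issued while the miss is outstanding.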