    • 51. Invention application
    • Title: Handling permanent and transient errors using a SIMD unit
    • Publication number: US20060190700A1
    • Publication date: 2006-08-24
    • Application number: US11063122
    • Filing date: 2005-02-22
    • Inventors: Erik Altman; Gheorghe Cascaval; Luis Ceze; Vijayalakshmi Srinivasan
    • IPC: G06F15/00
    • CPC: G06F11/1641
    • Abstract: A method for handling permanent and transient errors in a microprocessor is disclosed. The method includes reading a scalar value and a scalar operation from an execution unit of the microprocessor. The method further includes writing a copy of the scalar value into each of a plurality of elements of a vector register of a Single Instruction Multiple Data (SIMD) unit of the microprocessor and executing the scalar operation on each scalar value in each of the plurality of elements of the vector register of the SIMD unit using a vector operation. The method further includes comparing each result of the scalar operation on each scalar value in each of the plurality of elements of the vector register and detecting a permanent or transient error if all of the results are not identical. (A code sketch of this idea follows below.)
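The replicate-execute-compare structure in the abstract above can be modeled in ordinary code. The following is a minimal sketch, assuming a four-lane vector and integer addition as the scalar operation (both illustrative choices, not details taken from the patent): the scalar operands are copied into every lane, the operation runs on all lanes as one vector operation, and any disagreement among the lane results is reported as a permanent or transient error.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <string>

constexpr std::size_t kLanes = 4;  // assumed SIMD width (illustrative)

// "Write a copy of the scalar value into each element of a vector register."
std::array<std::uint32_t, kLanes> splat(std::uint32_t scalar) {
    std::array<std::uint32_t, kLanes> v{};
    v.fill(scalar);
    return v;
}

// Execute the scalar operation (here: addition) on every lane as a vector
// operation, then compare the lane results; any mismatch is reported as a
// permanent or transient error.
bool faulty_add(std::uint32_t a, std::uint32_t b, std::uint32_t& result) {
    auto va = splat(a);
    auto vb = splat(b);
    std::array<std::uint32_t, kLanes> r{};
    for (std::size_t i = 0; i < kLanes; ++i)
        r[i] = va[i] + vb[i];              // per-lane scalar operation
    for (std::size_t i = 1; i < kLanes; ++i)
        if (r[i] != r[0]) return true;     // lanes disagree -> error
    result = r[0];                         // all lanes agree
    return false;
}

int main() {
    std::uint32_t sum = 0;
    bool error = faulty_add(40, 2, sum);
    std::cout << (error ? std::string("error detected")
                        : "result: " + std::to_string(sum)) << '\n';
}
```

In real hardware the replication and comparison would happen inside the SIMD unit; the model only shows the control structure the abstract describes.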
    • 53. Invention grant
    • Title: Relaxation of synchronization for iterative convergent computations
    • Publication number: US09069545B2
    • Publication date: 2015-06-30
    • Application number: US13184718
    • Filing date: 2011-07-18
    • Inventors: Lakshminarayanan Renganarayana; Vijayalakshmi Srinivasan
    • IPC: G06F9/44; G06F9/30; G06F9/52; G06F9/45
    • CPC: G06F9/3004; G06F8/458; G06F9/30087; G06F9/30185; G06F9/52
    • Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the type of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes codes for unsynchronized data transfer for a subset of codes for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by employing a combination of software and hardware by using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer. (A code sketch of the relaxed-synchronization idea follows below.)
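As a software-level picture of what the abstract above calls substituting unsynchronized data transfer for synchronized data transfer, the following sketch accumulates a convergence residual across threads in two ways: a synchronized path using an atomic compare-and-swap, and a relaxed path that performs a plain, deliberately unsynchronized update. The solver loop, thread count, and the `relax` flag standing in for a programmer hint are illustrative assumptions, not the patent's mechanism.

```cpp
#include <atomic>
#include <cmath>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    constexpr int kThreads = 4;
    constexpr int kN = 1 << 16;
    const bool relax = true;                  // "hint": the residual tolerates races
    std::vector<double> x(kN, 1.0);

    std::atomic<double> residual_sync{0.0};   // synchronized accumulator
    double residual_relaxed = 0.0;            // unsynchronized accumulator

    auto worker = [&](int tid) {
        double local = 0.0;
        for (int i = tid; i < kN; i += kThreads) {
            double updated = 0.5 * (x[i] + 1.0);   // a convergent update
            local += std::fabs(updated - x[i]);
            x[i] = updated;                        // disjoint indices per thread
        }
        if (relax) {
            // Relaxed path: a plain, unsynchronized update. Formally a data
            // race, but for a convergence residual an occasional lost update
            // only perturbs intermediate values, which is the trade-off the
            // abstract describes for iterative convergent computations.
            residual_relaxed += local;
        } else {
            // Synchronized path: atomic read-modify-write of the shared total.
            double cur = residual_sync.load();
            while (!residual_sync.compare_exchange_weak(cur, cur + local)) {}
        }
    };

    std::vector<std::thread> pool;
    for (int t = 0; t < kThreads; ++t) pool.emplace_back(worker, t);
    for (auto& th : pool) th.join();
    std::cout << "residual: "
              << (relax ? residual_relaxed : residual_sync.load()) << '\n';
}
```

The relaxed path is the substitution a compiler (or hardware relaxation registers, in the other embodiment) would make only for data identified as race-tolerant.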
    • 55. Invention grant
    • Title: Write-through cache optimized for dependence-free parallel regions
    • Publication number: US08516197B2
    • Publication date: 2013-08-20
    • Application number: US13025706
    • Filing date: 2011-02-11
    • Inventors: Alexandre E. Eichenberger; Alan G. Gara; Martin Ohmacht; Vijayalakshmi Srinivasan
    • IPC: G06F12/00
    • CPC: G06F12/0837
    • Abstract: An apparatus, method and computer program product for improving performance of a parallel computing system. A first hardware local cache controller associated with a first local cache memory device of a first processor detects an occurrence of a false sharing of a first cache line by a second processor running the program code and allows the false sharing of the first cache line by the second processor. The false sharing of the first cache line occurs upon updating a first portion of the first cache line in the first local cache memory device by the first hardware local cache controller and subsequent updating of a second portion of the first cache line in a second local cache memory device by a second hardware local cache controller. (A code sketch of the false-sharing pattern follows below.)
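The controller behavior claimed in the abstract above is not observable from user code, but the access pattern it targets is easy to show. The following minimal sketch, assuming a typical 64-byte cache line, runs a dependence-free parallel region in which two threads update disjoint fields that normally sit on the same line (false sharing); the program is race-free and correct, and the patent concerns letting both local caches update such a line without the usual coherence ping-pong.

```cpp
#include <iostream>
#include <thread>

// Two independent counters that will typically occupy the same 64-byte
// cache line, so writes from the two threads falsely share that line.
struct SharedLine {
    long a = 0;
    long b = 0;
};

int main() {
    SharedLine line;
    // Dependence-free parallel region: each thread touches only its own field.
    std::thread t1([&line] { for (int i = 0; i < 1000000; ++i) ++line.a; });
    std::thread t2([&line] { for (int i = 0; i < 1000000; ++i) ++line.b; });
    t1.join();
    t2.join();
    std::cout << line.a << ' ' << line.b << '\n';   // prints "1000000 1000000"
}
```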
    • 57. Invention application
    • Title: RELAXATION OF SYNCHRONIZATION FOR ITERATIVE CONVERGENT COMPUTATIONS
    • Publication number: US20130024662A1
    • Publication date: 2013-01-24
    • Application number: US13184718
    • Filing date: 2011-07-18
    • Inventors: Lakshminarayanan Renganarayana; Vijayalakshmi Srinivasan
    • IPC: G06F9/30
    • CPC: G06F9/3004; G06F8/458; G06F9/30087; G06F9/30185; G06F9/52
    • Abstract: Systems and methods are disclosed that allow atomic updates to global data to be at least partially eliminated to reduce synchronization overhead in parallel computing. A compiler analyzes the data to be processed to selectively permit unsynchronized data transfer for at least one type of data. A programmer may provide a hint to expressly identify the type of data that are candidates for unsynchronized data transfer. In one embodiment, the synchronization overhead is reducible by generating an application program that selectively substitutes codes for unsynchronized data transfer for a subset of codes for synchronized data transfer. In another embodiment, the synchronization overhead is reducible by employing a combination of software and hardware by using relaxation data registers and decoders that collectively convert a subset of commands for synchronized data transfer into commands for unsynchronized data transfer.
    • 58. Invention application
    • Title: PREDICTING CACHE MISSES USING DATA ACCESS BEHAVIOR AND INSTRUCTION ADDRESS
    • Publication number: US20120284463A1
    • Publication date: 2012-11-08
    • Application number: US13099178
    • Filing date: 2011-05-02
    • Inventors: Vijayalakshmi Srinivasan; Brian R. Prasky
    • IPC: G06F12/08
    • CPC: G06F9/383; G06F9/3832; G06F9/3836; G06F9/3844; G06F9/3851
    • Abstract: In a decode stage of a hardware processor pipeline, one particular instruction of a plurality of instructions is decoded. It is determined that the particular instruction requires a memory access. Responsive to such determination, it is predicted whether the memory access will result in a cache miss. The predicting in turn includes accessing one of a plurality of entries in a pattern history table stored as a hardware table in the decode stage. The accessing is based, at least in part, upon at least a most recent entry in a global history buffer. The pattern history table stores a plurality of predictions. The global history buffer stores actual results of previous memory accesses as one of cache hits and cache misses. Additional steps include scheduling at least one additional one of the plurality of instructions in accordance with the predicting, and updating the pattern history table and the global history buffer subsequent to actual execution of the particular instruction in an execution stage of the hardware processor pipeline, to reflect whether the predicting was accurate. (A code sketch of this predictor organization follows below.)
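The predictor in the abstract above maps onto the familiar two-level structure used for branch prediction. The following minimal sketch assumes 2-bit saturating counters, an 8-bit global history of hit/miss outcomes, and an XOR of that history with the instruction address as the table index; the actual table size, history length, and indexing function are not given here and are illustrative assumptions.

```cpp
#include <cstdint>
#include <iostream>

// Minimal model: a pattern history table (PHT) of 2-bit saturating counters,
// indexed by the instruction address XORed with a global history of recent
// hit/miss outcomes. predict_miss() is what the decode stage would consult;
// update() is the training done after the access actually executes.
class MissPredictor {
    static constexpr int kHistoryBits = 8;               // assumed history length
    static constexpr int kEntries = 1 << kHistoryBits;
    std::uint8_t pht_[kEntries] = {};                    // 2-bit counters, 0..3
    std::uint32_t ghb_ = 0;                              // recent outcomes, 1 = miss

    int index(std::uint64_t pc) const {
        return static_cast<int>((pc ^ ghb_) & (kEntries - 1));
    }

public:
    // Decode stage: predict whether this memory access will miss.
    bool predict_miss(std::uint64_t pc) const { return pht_[index(pc)] >= 2; }

    // Execution stage: train the counter with the actual outcome and shift
    // the outcome into the global history.
    void update(std::uint64_t pc, bool missed) {
        std::uint8_t& c = pht_[index(pc)];
        if (missed && c < 3) ++c;
        if (!missed && c > 0) --c;
        ghb_ = ((ghb_ << 1) | (missed ? 1u : 0u)) & (kEntries - 1);
    }
};

int main() {
    MissPredictor p;
    const std::uint64_t pc = 0x400123;       // a load that, in this toy trace, always misses
    int correct = 0;
    for (int i = 0; i < 100; ++i) {
        if (p.predict_miss(pc)) ++correct;   // prediction at decode
        p.update(pc, /*missed=*/true);       // actual outcome after execution
    }
    std::cout << "correct miss predictions: " << correct << "/100\n";
}
```

A predicted miss would then feed the scheduling step the abstract mentions, letting independent instructions be issued while the miss is outstanding.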