    • 3. Patent Application
    • Efficient On-Chip Accelerator Interfaces to Reduce Software Overhead
    • Publication number: US20080222383A1
    • Publication date: 2008-09-11
    • Application number: US11684358
    • Filing date: 2007-03-09
    • Inventors: Lawrence A. Spracklen; Santosh G. Abraham; Adam R. Talcott
    • IPC: G06F9/34
    • CPC: G06F12/1027; G06F12/1036; G06F2212/1024; G06F2212/683
    • Abstract: In one embodiment, a processor comprises execution circuitry and a translation lookaside buffer (TLB) coupled to the execution circuitry. The execution circuitry is configured to execute a store instruction having a data operand, and is configured to generate a virtual address as part of executing the store instruction. The TLB is coupled to receive the virtual address and configured to translate the virtual address to a first physical address. Additionally, the TLB is coupled to receive the data operand and to translate the data operand to a second physical address. A hardware accelerator is also contemplated in various embodiments, as is a processor coupled to the hardware accelerator, a method, and a computer readable medium storing instructions which, when executed, implement a portion of the method.
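The core idea in this abstract can be sketched in a few lines of Python: the same TLB that translates the store's target address also translates the store's data operand, which is itself a virtual pointer. All names here are illustrative, not from the patent, and a hardware TLB would of course handle misses via page-table walks rather than a plain dictionary lookup.

```python
PAGE_SIZE = 4096

class TLB:
    """Toy TLB: a fixed map from virtual page numbers to physical page numbers."""
    def __init__(self, mappings):
        self.mappings = dict(mappings)  # vpn -> ppn

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        ppn = self.mappings[vpn]  # a real TLB would walk page tables on a miss
        return ppn * PAGE_SIZE + offset

def issue_accelerator_store(tlb, store_vaddr, data_operand_vaddr):
    """Translate BOTH the store's target virtual address and its data
    operand (itself a virtual pointer), so the accelerator receives two
    physical addresses without a trip through the operating system."""
    return tlb.translate(store_vaddr), tlb.translate(data_operand_vaddr)
```

Translating the operand in hardware is what removes the software overhead of pinning and translating accelerator buffers on every command.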
    • 4. Granted Patent
    • Efficient on-chip accelerator interfaces to reduce software overhead
    • Publication number: US07827383B2
    • Publication date: 2010-11-02
    • Application number: US11684358
    • Filing date: 2007-03-09
    • Inventors: Lawrence A. Spracklen; Santosh G. Abraham; Adam R. Talcott
    • IPC: G06F9/34; G06F12/08
    • CPC: G06F12/1027; G06F12/1036; G06F2212/1024; G06F2212/683
    • Abstract: In one embodiment, a processor comprises execution circuitry and a translation lookaside buffer (TLB) coupled to the execution circuitry. The execution circuitry is configured to execute a store instruction having a data operand, and is configured to generate a virtual address as part of executing the store instruction. The TLB is coupled to receive the virtual address and configured to translate the virtual address to a first physical address. Additionally, the TLB is coupled to receive the data operand and to translate the data operand to a second physical address. A hardware accelerator is also contemplated in various embodiments, as is a processor coupled to the hardware accelerator, a method, and a computer readable medium storing instructions which, when executed, implement a portion of the method.
    • 5. Granted Patent
    • Missing store operation accelerator
    • Publication number: US07757047B2
    • Publication date: 2010-07-13
    • Application number: US11271056
    • Filing date: 2005-11-12
    • Inventors: Santosh G. Abraham; Lawrence A. Spracklen; Yuan C. Chou
    • IPC: G06F13/16
    • CPC: G06F12/0859
    • Abstract: Maintaining a cache of indications of exclusively-owned coherence state for memory space units (e.g., cache lines) allows reduction, if not elimination, of delay from missing store operations. In addition, the indications are maintained without corresponding data of the memory space unit, thus allowing representation of a large memory space with a relatively small missing store operation accelerator. With the missing store operation accelerator, a store operation which misses in low-latency memory (e.g., L1 or L2 cache) proceeds as if the targeted memory space unit resides in the low-latency memory, if so indicated in the missing store operation accelerator. When a store operation misses in low-latency memory and hits in the accelerator, a positive acknowledgement is transmitted to the writing processing unit, allowing the store operation to proceed. An entry is allocated for the store operation, the store data is written into the allocated entry, and the target of the store operation is requested from memory. When a copy of the data at the requested memory space unit returns, the rest of the allocated entry is updated.
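The key property described above is that the accelerator stores only ownership *indications*, never line data, so a small structure can cover a large address space. A minimal Python sketch (class and method names are illustrative, not from the patent):

```python
class MissingStoreAccelerator:
    """Tracks only indications that a cache line is exclusively owned,
    not the line's data, so a small structure covers a large memory."""
    def __init__(self):
        self.exclusive_lines = set()

    def note_exclusive(self, line_addr):
        """Record that this line is held in exclusively-owned state."""
        self.exclusive_lines.add(line_addr)

    def store_miss(self, line_addr, data):
        """Handle a store that missed in low-latency memory (L1/L2)."""
        if line_addr in self.exclusive_lines:
            # Positive acknowledgement: the store proceeds immediately;
            # an entry holds the store data while the rest of the line
            # is fetched from memory in the background.
            return ("ack", {"line": line_addr, "data": data, "pending_fill": True})
        return ("stall", None)  # not indicated: wait for coherence as usual
```

The set stands in for the accelerator's tag array; the returned entry models the allocated buffer that is completed when the line's data arrives from memory.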
    • 6. Granted Patent
    • Execution displacement read-write alias prediction
    • Publication number: US07434031B1
    • Publication date: 2008-10-07
    • Application number: US10822390
    • Filing date: 2004-04-12
    • Inventors: Lawrence A. Spracklen; Santosh G. Abraham; Stevan Vlaovic
    • IPC: G06F9/30; G06F9/40; G06F15/00
    • CPC: G06F9/3826; G06F9/3834; G06F9/3838; G06F9/384; G06F9/3842; G06F9/3857
    • Abstract: RAW aliasing can be predicted with register bypassing based at least in part on execution displacement alias prediction. Repeated aliasing between read and write operations (e.g., within a loop) can be reliably predicted based on displacement between the aliasing operations. Performing register bypassing for operations predicted to alias facilitates faster RAW bypassing and mitigates the performance impact of aliasing read operations. The repeated aliasing between operations is tracked along with register information of the aliasing write operations. After exceeding a confidence threshold, an instance of a read operation is predicted to alias with an instance of a write operation in accordance with the previously observed repeated aliasing. Based on displacement between the instances of the operations, the register information of the write operation instance is used to bypass data to the read operation instance.
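The confidence-threshold part of this mechanism can be illustrated with a small Python sketch. It is deliberately simplified: it tracks only (write PC, read PC) pairs and an observation count, omitting the displacement bookkeeping and register-bypass datapath that the patent adds on top; all names are illustrative.

```python
class AliasPredictor:
    """Predicts read-after-write aliasing from repeated observations;
    once confident, the write's register value would be bypassed to the read."""
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.confidence = {}  # (write_pc, read_pc) -> times aliasing observed

    def observe_alias(self, write_pc, read_pc):
        """Called when a read was detected to alias with an earlier write."""
        key = (write_pc, read_pc)
        self.confidence[key] = self.confidence.get(key, 0) + 1

    def predict_bypass(self, write_pc, read_pc):
        """True once the pair has aliased often enough to risk bypassing."""
        return self.confidence.get((write_pc, read_pc), 0) >= self.threshold
```

In the patented scheme the prediction is additionally keyed on the execution displacement between the two instruction instances, so a load in a loop can be matched to the right dynamic instance of the store.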
    • 7. Granted Patent
    • Software-based technique for improving the effectiveness of prefetching during scout mode
    • Publication number: US07373482B1
    • Publication date: 2008-05-13
    • Application number: US11139708
    • Filing date: 2005-05-26
    • Inventors: Lawrence A. Spracklen; Yuan C. Chou; Santosh G. Abraham
    • IPC: G06F9/30; G06F9/40; G06F15/00
    • CPC: G06F9/3842; G06F9/30043; G06F9/30105; G06F9/30181; G06F9/30189; G06F9/3802; G06F9/3814; G06F9/383; G06F9/3838; G06F9/3863
    • Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. During operation, the system executes program instructions in a normal-execution mode. Upon encountering a condition which causes the processor to enter scout mode, the system performs a checkpoint and commences execution of instructions in scout mode, wherein the instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of the processor. During execution of a load instruction during scout mode, if the load instruction is a special load instruction and if the load instruction causes a lower-level cache miss, the system waits for data to be returned from a higher-level cache before resuming execution of subsequent instructions in scout mode, instead of disregarding the result of the load instruction and immediately resuming execution in scout mode. In this way, the data returned from the higher-level cache can help in generating addresses for subsequent prefetches during scout mode.
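The benefit of the special load is easiest to see with pointer chasing: an ordinary scout load that misses discards its value, so the next node's address is unknown and the prefetch chain breaks. A minimal Python sketch, with memory modeled as a dictionary mapping each node's address to the next pointer (names are illustrative):

```python
def scout_prefetch(memory, head, depth, special_load=True):
    """Scout-mode sketch: walk a pointer chain, issuing a prefetch per node.
    With a special load the scout waits for the higher-level cache to return
    the pointer value, so it can compute the next prefetch address; an
    ordinary scout load discards the missed value and the chain is lost."""
    prefetched = []
    addr = head
    for _ in range(depth):
        prefetched.append(addr)          # issue prefetch for this node
        if not special_load:
            break                        # value disregarded: chain broken
        addr = memory[addr]              # wait for data, then chase pointer
        if addr is None:
            break                        # end of list
    return prefetched
```

Waiting costs scout-mode cycles, which is why the patent applies it selectively via a special load instruction inserted by software rather than to every load.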
    • 8. Granted Patent
    • Efficient caching of stores in scalable chip multi-threaded systems
    • Publication number: US07793044B1
    • Publication date: 2010-09-07
    • Application number: US11654150
    • Filing date: 2007-01-16
    • Inventors: Lawrence A. Spracklen; Yuan C. Chou; Santosh G. Abraham
    • IPC: G06F13/00; G06F13/28
    • CPC: G06F12/0811; G06F12/084
    • Abstract: In accordance with one embodiment, an enhanced chip multiprocessor permits an L1 cache to request ownership of a data line from a shared L2 cache. A determination is made whether to deny or grant the request for ownership based on the sharing of the data line. In one embodiment, the sharing of the data line is determined from an enhanced L2 cache directory entry associated with the data line. If ownership of the data line is granted, the current data line is passed from the shared L2 to the requesting L1 cache, and an associated enhanced L1 cache directory entry and the enhanced L2 cache directory entry are updated to reflect the L1 cache ownership of the data line. Consequently, updates of the data line by the L1 cache do not go through the shared L2 cache, thus reducing transaction pressure on the shared L2 cache.
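The grant/deny decision described above hinges on the L2 directory's sharer information. A minimal Python sketch of that decision (the directory layout and method names are illustrative, not from the patent):

```python
class SharedL2:
    """Grants an L1 cache ownership of a line only when no other core
    shares it, based on a per-line directory of sharers."""
    def __init__(self):
        self.directory = {}  # line -> {"sharers": set of cores, "owner": core or None}

    def touch(self, line, core):
        """Record that a core's L1 holds (shares) this line."""
        entry = self.directory.setdefault(line, {"sharers": set(), "owner": None})
        entry["sharers"].add(core)

    def request_ownership(self, line, core):
        """Grant ownership iff no other core shares the line."""
        entry = self.directory.setdefault(line, {"sharers": set(), "owner": None})
        if entry["sharers"] - {core}:
            return False  # shared by others: deny; writes stay at the L2
        entry["owner"] = core
        entry["sharers"] = {core}
        return True  # the owning L1 now absorbs writes, off the L2's path
```

Once granted, repeated stores to the line hit in the owning L1 and never generate shared-L2 transactions, which is the scalability win the abstract claims.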
    • 10. Granted Patent
    • Hardware-based technique for improving the effectiveness of prefetching during scout mode
    • Publication number: US07529911B1
    • Publication date: 2009-05-05
    • Application number: US11139866
    • Filing date: 2005-05-26
    • Inventors: Lawrence A. Spracklen; Yuan C. Chou; Santosh G. Abraham
    • IPC: G06F9/30; G06F9/40; G06F15/00
    • CPC: G06F9/3842; G06F9/383; G06F9/3834; G06F9/3838; G06F9/3861
    • Abstract: One embodiment of the present invention provides a system that improves the effectiveness of prefetching during execution of instructions in scout mode. Upon encountering a non-data dependent stall condition, the system performs a checkpoint and commences execution of instructions in scout mode, wherein instructions are speculatively executed to prefetch future memory operations, but wherein results are not committed to the architectural state of a processor. When the system executes a load instruction during scout mode, if the load instruction causes a lower-level cache miss, the system allows the load instruction to access a higher-level cache. Next, the system places the load instruction and subsequent dependent instructions into a deferred queue, and resumes execution of the program in scout mode. If the load instruction ultimately causes a hit in the higher-level cache, the system replays the load instruction and subsequent dependent instructions in the deferred queue, whereby the value retrieved from the higher-level cache can help in generating prefetches during scout mode.
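In contrast to the software technique in entry 7, the hardware variant never stalls the scout: missed loads are parked in a deferred queue and replayed when the higher-level cache responds. A much-simplified Python sketch that defers only the loads themselves, not their dependent instructions (all names are illustrative):

```python
def run_scout(loads, l1, l2):
    """Deferred-queue sketch: loads that miss L1 are sent to L2 and parked
    instead of being discarded; when the L2 access hits, the deferred load
    is replayed and its value becomes available to later scout work."""
    results, deferred = {}, []
    for dst, addr in loads:
        if addr in l1:
            results[dst] = l1[addr]       # L1 hit: execute normally
        else:
            deferred.append((dst, addr))  # L1 miss: defer, keep scouting
    replayed = []
    for dst, addr in deferred:            # L2 responses arrive later
        if addr in l2:
            results[dst] = l2[addr]       # L2 hit: replay the deferred load
            replayed.append(dst)
        # L2 miss: value never materializes; dependents stay unexecuted
    return results, replayed
```

The real mechanism also replays the chain of instructions dependent on each deferred load, which is what lets the returned value feed prefetch-address computation mid-scout.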