    • 3. Granted invention patent
    • Title: Dependency matrix for the determination of load dependencies
    • Publication No.: US09262171B2
    • Publication date: 2016-02-16
    • Application No.: US12495025
    • Filing date: 2009-06-30
    • Inventors: Robert T. Golla; Matthew B. Smittle; Xiang Shan Li
    • IPC: G06F15/00; G06F9/30; G06F9/40; G06F9/38
    • CPC: G06F9/3842; G06F9/3838; G06F9/3851; G06F9/3855; G06F9/3857; G06F9/3861
    • Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB, and not have a read-after-write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue. (A minimal software sketch of this mechanism follows this entry.)
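The dependency-matrix bookkeeping in the abstract above can be pictured with a small software model. This is only a sketch under assumed parameters: the PickQueue class, its eight-entry size, and the method names are invented for illustration, and the transitive walk over dependency bits stands in for wake-up and cancel logic that real hardware evaluates in parallel within a cycle.

```python
# Minimal software model of a dependency matrix in a unified pick queue.
# Entry count, field names, and methods are illustrative assumptions, not
# the structures recited in the patent claims.

NUM_ENTRIES = 8

class PickQueue:
    def __init__(self):
        # dep_matrix[i] is a bit mask: bit j set means entry i depends on entry j.
        self.dep_matrix = [0] * NUM_ENTRIES
        self.valid = [False] * NUM_ENTRIES
        self.picked = [False] * NUM_ENTRIES

    def allocate(self, entry, producer_entries):
        """Allocate a decoded/renamed instruction and record its producers."""
        self.valid[entry] = True
        self.picked[entry] = False
        self.dep_matrix[entry] = 0
        for p in producer_entries:
            self.dep_matrix[entry] |= 1 << p

    def dependents_of(self, load_entry):
        """Entries whose dependency vectors reach the given load, transitively."""
        deps, changed = 1 << load_entry, True
        while changed:
            changed = False
            for i in range(NUM_ENTRIES):
                if self.valid[i] and (self.dep_matrix[i] & deps) and not ((deps >> i) & 1):
                    deps |= 1 << i
                    changed = True
        return [i for i in range(NUM_ENTRIES) if ((deps >> i) & 1) and i != load_entry]

    def squash_load_speculation(self, load_entry):
        """The load missed the cache or TLB, or hit a RAW hazard: reset its
        dependents so they can be picked again once the load resolves."""
        for i in self.dependents_of(load_entry):
            self.picked[i] = False   # allow subsequent picking / re-issue

pq = PickQueue()
pq.allocate(0, [])          # entry 0: the load
pq.allocate(1, [0])         # entry 1: consumes the load result
pq.allocate(2, [1])         # entry 2: consumes entry 1
pq.picked[1] = pq.picked[2] = True
pq.squash_load_speculation(0)
print(pq.dependents_of(0))  # [1, 2] -> both reset for re-pick
```

The appeal of the matrix layout is that the set of instructions to cancel or re-pick after a misspeculated load falls out of a bit-vector computation over the queue entries.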
    • 4. Invention patent application
    • Title: DEPENDENCY MATRIX FOR THE DETERMINATION OF LOAD DEPENDENCIES
    • Publication No.: US20100332806A1
    • Publication date: 2010-12-30
    • Application No.: US12495025
    • Filing date: 2009-06-30
    • Inventors: Robert T. Golla; Matthew B. Smittle; Xiang Shan Li
    • IPC: G06F9/30
    • CPC: G06F9/3842; G06F9/3838; G06F9/3851; G06F9/3855; G06F9/3857; G06F9/3861
    • Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB, and not have a read-after-write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue.
    • 5. Invention patent application
    • Title: DYNAMIC MITIGATION OF THREAD HOGS ON A THREADED PROCESSOR
    • Publication No.: US20110029978A1
    • Publication date: 2011-02-03
    • Application No.: US12511620
    • Filing date: 2009-07-29
    • Inventors: Jared C. Smolens; Robert T. Golla; Matthew B. Smittle
    • IPC: G06F9/46
    • CPC: G06F9/5016; G06F2209/504; G06F2209/507; Y02D10/22
    • Abstract: Systems and methods for efficient thread arbitration in a processor. A processor comprises a multi-threaded resource. The resource may include an array of entries which may be allocated by threads. A thread arbitration table corresponding to a given thread stores a high and a low threshold value in each table entry. A thread history shift register (HSR) indexes the table, wherein each bit of the HSR indicates whether the given thread is a thread hog. When the given thread has more allocated entries in the array than the high threshold of the table entry, the given thread is stalled from further allocating array entries. Similarly, when the given thread has fewer allocated entries in the array than the low threshold of the selected table entry, the given thread is permitted to allocate entries. In this manner, threads that hog dynamic resources can be mitigated such that more resources are available to other threads that are not thread hogs. This can result in a significant increase in overall processor performance. (A small sketch of this arbitration scheme follows this entry.)
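A rough software rendering of the arbitration table and history shift register described above. The table size, HSR width, threshold values, and the hysteresis behavior between the two thresholds are assumptions made for the sketch; the abstract only fixes the high/low comparison rules.

```python
# Illustrative model of per-thread hog mitigation using a thread arbitration
# table indexed by a thread history shift register (HSR). Table contents,
# HSR width, and thresholds are assumptions for the sketch.

HSR_BITS = 3  # 3-bit history -> 8 table entries per thread

class ThreadArbiter:
    def __init__(self, thresholds):
        # thresholds: one (high, low) pair per table entry, indexed by the HSR.
        assert len(thresholds) == 1 << HSR_BITS
        self.table = thresholds
        self.hsr = 0            # history shift register: 1 bits mark "hog" intervals
        self.stalled = False    # hysteresis: stay stalled until below the low mark

    def update(self, allocated_entries):
        """Evaluate one arbitration interval; return True if the thread may
        continue to allocate entries in the shared array."""
        high, low = self.table[self.hsr]
        if allocated_entries > high:
            self.stalled = True        # thread is hogging: stop further allocation
        elif allocated_entries < low:
            self.stalled = False       # usage dropped: permit allocation again
        # Record whether this interval looked like hogging, then shift history.
        hog_bit = 1 if allocated_entries > high else 0
        self.hsr = ((self.hsr << 1) | hog_bit) & ((1 << HSR_BITS) - 1)
        return not self.stalled

# Example: a uniform table with high=12, low=4 for every history pattern.
arb = ThreadArbiter([(12, 4)] * 8)
for count in (3, 10, 14, 9, 5, 2):
    print(count, arb.update(count))
# 3 True, 10 True, 14 False (hog), 9 False, 5 False, 2 True
```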
    • 6. Invention patent application
    • Title: OPTIMAL DEALLOCATION OF INSTRUCTIONS FROM A UNIFIED PICK QUEUE
    • Publication No.: US20110078697A1
    • Publication date: 2011-03-31
    • Application No.: US12571200
    • Filing date: 2009-09-30
    • Inventors: Matthew B. Smittle; Robert T. Golla
    • IPC: G06F9/46
    • CPC: G06F9/3836; G06F9/3838; G06F9/384; G06F9/3842; G06F9/3851; G06F9/3857
    • Abstract: Systems and methods for efficient out-of-order dynamic deallocation of entries within a shared storage resource in a processor. A processor comprises a unified pick queue that includes an array configured to dynamically allocate any entry of a plurality of entries for a decoded and renamed instruction. This instruction may correspond to any available active threads supported by the processor. The processor includes circuitry configured to determine whether an instruction corresponding to an allocated entry of the plurality of entries is dependent on a speculative instruction and whether the instruction has a fixed instruction execution latency. In response to determining the instruction is not dependent on a speculative instruction, the instruction has a fixed instruction execution latency, and said latency has transpired, the circuitry may deallocate the instruction from the allocated entry. (The deallocation condition is restated as a short sketch after this entry.)
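The deallocation rule in the abstract reduces to a three-part predicate, sketched below. The entry fields and the cycle arithmetic are assumptions for illustration; hardware would evaluate the equivalent condition for every queue entry in parallel.

```python
# Sketch of the out-of-order deallocation condition described above. Field
# names and cycle bookkeeping are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional

@dataclass
class PickQueueEntry:
    valid: bool
    issued_cycle: int                 # cycle the instruction was picked
    fixed_latency: Optional[int]      # None for variable-latency ops (e.g. loads)
    depends_on_speculative: bool

def can_deallocate(entry: PickQueueEntry, current_cycle: int) -> bool:
    """An entry may be freed out of order once it can no longer be replayed:
    it does not depend on a speculative instruction, it has a fixed execution
    latency, and that latency has already transpired."""
    if not entry.valid or entry.depends_on_speculative:
        return False
    if entry.fixed_latency is None:
        return False
    return current_cycle - entry.issued_cycle >= entry.fixed_latency

add = PickQueueEntry(True, issued_cycle=10, fixed_latency=1,
                     depends_on_speculative=False)
load_user = PickQueueEntry(True, issued_cycle=10, fixed_latency=1,
                           depends_on_speculative=True)
print(can_deallocate(add, 12))        # True: latency elapsed, nothing can replay it
print(can_deallocate(load_user, 12))  # False: its producer may still misspeculate
```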
    • 7. Granted invention patent
    • Title: Dynamic mitigation of thread hogs on a threaded processor
    • Publication No.: US08347309B2
    • Publication date: 2013-01-01
    • Application No.: US12511620
    • Filing date: 2009-07-29
    • Inventors: Jared C. Smolens; Robert T. Golla; Matthew B. Smittle
    • IPC: G06F9/46; G06F9/30
    • CPC: G06F9/5016; G06F2209/504; G06F2209/507; Y02D10/22
    • Abstract: Systems and methods for efficient thread arbitration in a processor. A processor comprises a multi-threaded resource. The resource may include an array of entries which may be allocated by threads. A thread arbitration table corresponding to a given thread stores a high and a low threshold value in each table entry. A thread history shift register (HSR) indexes the table, wherein each bit of the HSR indicates whether the given thread is a thread hog. When the given thread has more allocated entries in the array than the high threshold of the table entry, the given thread is stalled from further allocating array entries. Similarly, when the given thread has fewer allocated entries in the array than the low threshold of the selected table entry, the given thread is permitted to allocate entries. In this manner, threads that hog dynamic resources can be mitigated such that more resources are available to other threads that are not thread hogs. This can result in a significant increase in overall processor performance.
    • 8. Granted invention patent
    • Title: Optimal deallocation of instructions from a unified pick queue
    • Publication No.: US09286075B2
    • Publication date: 2016-03-15
    • Application No.: US12571200
    • Filing date: 2009-09-30
    • Inventors: Matthew B. Smittle; Robert T. Golla
    • IPC: G06F9/38
    • CPC: G06F9/3836; G06F9/3838; G06F9/384; G06F9/3842; G06F9/3851; G06F9/3857
    • Abstract: Systems and methods for efficient out-of-order dynamic deallocation of entries within a shared storage resource in a processor. A processor comprises a unified pick queue that includes an array configured to dynamically allocate any entry of a plurality of entries for a decoded and renamed instruction. This instruction may correspond to any available active threads supported by the processor. The processor includes circuitry configured to determine whether an instruction corresponding to an allocated entry of the plurality of entries is dependent on a speculative instruction and whether the instruction has a fixed instruction execution latency. In response to determining the instruction is not dependent on a speculative instruction, the instruction has a fixed instruction execution latency, and said latency has transpired, the circuitry may deallocate the instruction from the allocated entry.
    • 9. Granted invention patent
    • Title: Processor operating mode for mitigating dependency conditions between instructions having different operand sizes
    • Publication No.: US08504805B2
    • Publication date: 2013-08-06
    • Application No.: US12428464
    • Filing date: 2009-04-22
    • Inventors: Robert T. Golla; Paul J. Jordan; Jama I. Barreh; Matthew B. Smittle; Yuan C. Chou; Jared C. Smolens
    • IPC: G06F7/483
    • CPC: G06F9/3838; G06F9/30032; G06F9/30109; G06F9/30189
    • Abstract: Various techniques for mitigating dependencies between groups of instructions are disclosed. In one embodiment, such dependencies include “evil twin” conditions, in which a first floating-point instruction has as a destination a first portion of a logical floating-point register (e.g., a single-precision write), and in which a second, subsequent floating-point instruction has as a source the first portion and a second portion of the same logical floating-point register (e.g., a double-precision read). The disclosed techniques may be applicable in a multithreaded processor implementing register renaming. In one embodiment, a processor may enter an operating mode in which detection of evil twin “producers” (e.g., single-precision writes) causes the instruction sequence to be modified to break potential dependencies. Modification of the instruction sequence may continue until one or more exit criteria are reached (e.g., committing a predetermined number of single-precision writes). This operating mode may be employed on a per-thread basis. (A toy model of the detection and mode transitions follows this entry.)
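A toy model of the detection and per-thread mode transitions described above. The instruction representation, the "+merge" marker standing in for the actual rewrite, and the exit threshold of four committed single-precision writes are all assumptions; the abstract does not specify how the sequence is modified.

```python
# Abstract model of "evil twin" detection and the mitigation operating mode.
# Encoding, rewrite marker, and exit threshold are illustrative assumptions.

EXIT_AFTER_SP_WRITES = 4     # assumed exit criterion: N single-precision commits

def run_thread(instrs):
    """instrs: list of dicts with keys op, dest, dest_width (32/64), and
    srcs = list of (reg, width) reads. Yields (op, mitigating) per instruction."""
    mitigating = False
    sp_writes = 0
    wrote_single = {}        # logical FP reg -> last write was single-precision
    for ins in instrs:
        # Evil-twin consumer: a double-precision read of a register whose most
        # recent producer wrote only its single-precision half.
        if any(w == 64 and wrote_single.get(r) for r, w in ins["srcs"]):
            mitigating, sp_writes = True, 0     # enter the mitigation mode
        if mitigating and ins["dest_width"] == 32 and ins["dest"]:
            ins["op"] += "+merge"    # stand-in for breaking the partial-write dependency
            sp_writes += 1
            if sp_writes >= EXIT_AFTER_SP_WRITES:
                mitigating = False   # exit criterion reached
        if ins["dest"]:
            wrote_single[ins["dest"]] = (ins["dest_width"] == 32)
        yield ins["op"], mitigating

program = [
    {"op": "fadds", "dest": "f2", "dest_width": 32, "srcs": []},
    {"op": "faddd", "dest": "f4", "dest_width": 64, "srcs": [("f2", 64)]},  # evil twin
    {"op": "fmuls", "dest": "f6", "dest_width": 32, "srcs": []},
]
print(list(run_thread(program)))
# [('fadds', False), ('faddd', True), ('fmuls+merge', True)]
```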
    • 10. Invention patent application
    • Title: DIVISION UNIT WITH MULTIPLE DIVIDE ENGINES
    • Publication No.: US20130179664A1
    • Publication date: 2013-07-11
    • Application No.: US13345391
    • Filing date: 2012-01-06
    • Inventors: Christopher H. Olson; Jeffrey S. Brooks; Matthew B. Smittle
    • IPC: G06F7/487; G06F9/38; G06F7/537; G06F9/302; G06F5/01; G06F9/30
    • CPC: G06F9/3895; G06F7/49936; G06F7/535; G06F7/5375; G06F9/3001; G06F9/3875; G06F9/3885
    • Abstract: Techniques are disclosed relating to integrated circuits that include hardware support for divide and/or square root operations. In one embodiment, an integrated circuit is disclosed that includes a division unit that, in turn, includes a normalization circuit and a plurality of divide engines. The normalization circuit is configured to normalize a set of operands. Each divide engine is configured to operate on a respective normalized set of operands received from the normalization circuit. In some embodiments, the integrated circuit includes a scheduler unit configured to select instructions for issuance to a plurality of execution units including the division unit. The scheduler unit is further configured to maintain a counter indicative of a number of instructions currently being operated on by the division unit, and to determine, based on the counter, whether to schedule subsequent instructions for issuance to the division unit. (A sketch of the counter-gated scheduling follows this entry.)
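The counter-gated scheduling can be pictured with a small model. The engine count, the Scheduler/DivisionUnit split, and the method names are assumptions; normalization and the iterative divide algorithm themselves are elided.

```python
# Small model of the scheduling idea above: a division unit with several
# divide engines behind one normalization stage, and a scheduler counter that
# gates issue. Engine count and interfaces are illustrative assumptions.

NUM_DIVIDE_ENGINES = 2

class DivisionUnit:
    def __init__(self):
        self.busy = 0                      # divides currently in flight

    def can_accept(self):
        return self.busy < NUM_DIVIDE_ENGINES

    def issue(self, a, b):
        # A real unit would first normalize the operands, then hand the
        # normalized pair to a free divide engine; both steps are elided here.
        assert self.can_accept()
        self.busy += 1
        return a / b                       # placeholder for the engine's result

    def complete_one(self):
        self.busy -= 1

class Scheduler:
    """Keeps its own count of divides outstanding in the division unit and
    only issues a new divide when that counter says an engine is free."""
    def __init__(self, div_unit):
        self.div_unit = div_unit
        self.outstanding = 0

    def try_issue_divide(self, a, b):
        if self.outstanding >= NUM_DIVIDE_ENGINES:
            return None                    # hold the instruction this cycle
        self.outstanding += 1
        return self.div_unit.issue(a, b)

    def on_divide_complete(self):
        self.outstanding -= 1
        self.div_unit.complete_one()

sched = Scheduler(DivisionUnit())
print(sched.try_issue_divide(10.0, 4.0))   # 2.5  -> first engine
print(sched.try_issue_divide(9.0, 3.0))    # 3.0  -> second engine
print(sched.try_issue_divide(8.0, 2.0))    # None -> both engines busy
sched.on_divide_complete()
print(sched.try_issue_divide(8.0, 2.0))    # 4.0  -> an engine freed up
```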