    • 3. Granted invention patent
    • Title: Dependency matrix for the determination of load dependencies
    • Publication No.: US09262171B2
    • Publication date: 2016-02-16
    • Application No.: US12495025
    • Filing date: 2009-06-30
    • Inventors: Robert T. Golla; Matthew B. Smittle; Xiang Shan Li
    • IPC: G06F15/00; G06F9/30; G06F9/40; G06F9/38
    • CPC: G06F9/3842; G06F9/3838; G06F9/3851; G06F9/3855; G06F9/3857; G06F9/3861
    • Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB, and not have a read-after-write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue. (A minimal software sketch of this mechanism follows this entry.)
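The dependency-matrix bookkeeping in the abstract above can be pictured with a small software model. This is only a sketch under assumed parameters: the PickQueue class, its eight-entry size, and the method names are invented for illustration, and the transitive walk over dependency bits stands in for wake-up and cancel logic that real hardware evaluates in parallel within a cycle.

```python
# Minimal software model of a dependency matrix in a unified pick queue.
# Entry count, field names, and methods are illustrative assumptions, not
# the structures recited in the patent claims.

NUM_ENTRIES = 8

class PickQueue:
    def __init__(self):
        # dep_matrix[i] is a bit mask: bit j set means entry i depends on entry j.
        self.dep_matrix = [0] * NUM_ENTRIES
        self.valid = [False] * NUM_ENTRIES
        self.picked = [False] * NUM_ENTRIES

    def allocate(self, entry, producer_entries):
        """Allocate a decoded/renamed instruction and record its producers."""
        self.valid[entry] = True
        self.picked[entry] = False
        self.dep_matrix[entry] = 0
        for p in producer_entries:
            self.dep_matrix[entry] |= 1 << p

    def dependents_of(self, load_entry):
        """Entries whose dependency vectors reach the given load, transitively."""
        deps, changed = 1 << load_entry, True
        while changed:
            changed = False
            for i in range(NUM_ENTRIES):
                if self.valid[i] and (self.dep_matrix[i] & deps) and not ((deps >> i) & 1):
                    deps |= 1 << i
                    changed = True
        return [i for i in range(NUM_ENTRIES) if ((deps >> i) & 1) and i != load_entry]

    def squash_load_speculation(self, load_entry):
        """The load missed the cache or TLB, or hit a RAW hazard: reset its
        dependents so they can be picked again once the load resolves."""
        for i in self.dependents_of(load_entry):
            self.picked[i] = False   # allow subsequent picking / re-issue

pq = PickQueue()
pq.allocate(0, [])          # entry 0: the load
pq.allocate(1, [0])         # entry 1: consumes the load result
pq.allocate(2, [1])         # entry 2: consumes entry 1
pq.picked[1] = pq.picked[2] = True
pq.squash_load_speculation(0)
print(pq.dependents_of(0))  # [1, 2] -> both reset for re-pick
```

The appeal of the matrix layout is that the set of instructions to cancel or re-pick after a misspeculated load falls out of a bit-vector computation over the queue entries.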
    • 4. Invention patent application
    • Title: DEPENDENCY MATRIX FOR THE DETERMINATION OF LOAD DEPENDENCIES
    • Publication No.: US20100332806A1
    • Publication date: 2010-12-30
    • Application No.: US12495025
    • Filing date: 2009-06-30
    • Inventors: Robert T. Golla; Matthew B. Smittle; Xiang Shan Li
    • IPC: G06F9/30
    • CPC: G06F9/3842; G06F9/3838; G06F9/3851; G06F9/3855; G06F9/3857; G06F9/3861
    • Abstract: Systems and methods for identification of dependent instructions on speculative load operations in a processor. A processor allocates entries of a unified pick queue for decoded and renamed instructions. Each entry of a corresponding dependency matrix is configured to store a dependency bit for each other instruction in the pick queue. The processor speculates that loads will hit in the data cache, hit in the TLB, and not have a read-after-write (RAW) hazard. For each unresolved load, the pick queue tracks dependent instructions via dependency vectors based upon the dependency matrix. If a load speculation is found to be incorrect, dependent instructions in the pick queue are reset to allow for subsequent picking, and dependent instructions in flight are canceled. On completion of a load miss, dependent operations are re-issued. On resolution of a TLB miss or RAW hazard, the original load is replayed and dependent operations are issued again from the pick queue.
    • 5. Invention patent application
    • Title: DYNAMIC MITIGATION OF THREAD HOGS ON A THREADED PROCESSOR
    • Publication No.: US20110029978A1
    • Publication date: 2011-02-03
    • Application No.: US12511620
    • Filing date: 2009-07-29
    • Inventors: Jared C. Smolens; Robert T. Golla; Matthew B. Smittle
    • IPC: G06F9/46
    • CPC: G06F9/5016; G06F2209/504; G06F2209/507; Y02D10/22
    • Abstract: Systems and methods for efficient thread arbitration in a processor. A processor comprises a multi-threaded resource. The resource may include an array of entries which may be allocated by threads. A thread arbitration table corresponding to a given thread stores a high and a low threshold value in each table entry. A thread history shift register (HSR) indexes the table, wherein each bit of the HSR indicates whether the given thread is a thread hog. When the given thread has more allocated entries in the array than the high threshold of the table entry, the given thread is stalled from further allocating array entries. Similarly, when the given thread has fewer allocated entries in the array than the low threshold of the selected table entry, the given thread is permitted to allocate entries. In this manner, threads that hog dynamic resources can be mitigated such that more resources are available to other threads that are not thread hogs. This can result in a significant increase in overall processor performance. (A small sketch of this arbitration scheme follows this entry.)
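A rough software rendering of the arbitration table and history shift register described above. The table size, HSR width, threshold values, and the hysteresis behavior between the two thresholds are assumptions made for the sketch; the abstract only fixes the high/low comparison rules.

```python
# Illustrative model of per-thread hog mitigation using a thread arbitration
# table indexed by a thread history shift register (HSR). Table contents,
# HSR width, and thresholds are assumptions for the sketch.

HSR_BITS = 3  # 3-bit history -> 8 table entries per thread

class ThreadArbiter:
    def __init__(self, thresholds):
        # thresholds: one (high, low) pair per table entry, indexed by the HSR.
        assert len(thresholds) == 1 << HSR_BITS
        self.table = thresholds
        self.hsr = 0            # history shift register: 1 bits mark "hog" intervals
        self.stalled = False    # hysteresis: stay stalled until below the low mark

    def update(self, allocated_entries):
        """Evaluate one arbitration interval; return True if the thread may
        continue to allocate entries in the shared array."""
        high, low = self.table[self.hsr]
        if allocated_entries > high:
            self.stalled = True        # thread is hogging: stop further allocation
        elif allocated_entries < low:
            self.stalled = False       # usage dropped: permit allocation again
        # Record whether this interval looked like hogging, then shift history.
        hog_bit = 1 if allocated_entries > high else 0
        self.hsr = ((self.hsr << 1) | hog_bit) & ((1 << HSR_BITS) - 1)
        return not self.stalled

# Example: a uniform table with high=12, low=4 for every history pattern.
arb = ThreadArbiter([(12, 4)] * 8)
for count in (3, 10, 14, 9, 5, 2):
    print(count, arb.update(count))
# 3 True, 10 True, 14 False (hog), 9 False, 5 False, 2 True
```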
    • 6. Invention patent application
    • Title: OPTIMAL DEALLOCATION OF INSTRUCTIONS FROM A UNIFIED PICK QUEUE
    • Publication No.: US20110078697A1
    • Publication date: 2011-03-31
    • Application No.: US12571200
    • Filing date: 2009-09-30
    • Inventors: Matthew B. Smittle; Robert T. Golla
    • IPC: G06F9/46
    • CPC: G06F9/3836; G06F9/3838; G06F9/384; G06F9/3842; G06F9/3851; G06F9/3857
    • Abstract: Systems and methods for efficient out-of-order dynamic deallocation of entries within a shared storage resource in a processor. A processor comprises a unified pick queue that includes an array configured to dynamically allocate any entry of a plurality of entries for a decoded and renamed instruction. This instruction may correspond to any available active threads supported by the processor. The processor includes circuitry configured to determine whether an instruction corresponding to an allocated entry of the plurality of entries is dependent on a speculative instruction and whether the instruction has a fixed instruction execution latency. In response to determining the instruction is not dependent on a speculative instruction, the instruction has a fixed instruction execution latency, and said latency has transpired, the circuitry may deallocate the instruction from the allocated entry. (The deallocation condition is restated as a short sketch after this entry.)
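The deallocation rule in the abstract reduces to a three-part predicate, sketched below. The entry fields and the cycle arithmetic are assumptions for illustration; hardware would evaluate the equivalent condition for every queue entry in parallel.

```python
# Sketch of the out-of-order deallocation condition described above. Field
# names and cycle bookkeeping are illustrative assumptions.

from dataclasses import dataclass
from typing import Optional

@dataclass
class PickQueueEntry:
    valid: bool
    issued_cycle: int                 # cycle the instruction was picked
    fixed_latency: Optional[int]      # None for variable-latency ops (e.g. loads)
    depends_on_speculative: bool

def can_deallocate(entry: PickQueueEntry, current_cycle: int) -> bool:
    """An entry may be freed out of order once it can no longer be replayed:
    it does not depend on a speculative instruction, it has a fixed execution
    latency, and that latency has already transpired."""
    if not entry.valid or entry.depends_on_speculative:
        return False
    if entry.fixed_latency is None:
        return False
    return current_cycle - entry.issued_cycle >= entry.fixed_latency

add = PickQueueEntry(True, issued_cycle=10, fixed_latency=1,
                     depends_on_speculative=False)
load_user = PickQueueEntry(True, issued_cycle=10, fixed_latency=1,
                           depends_on_speculative=True)
print(can_deallocate(add, 12))        # True: latency elapsed, nothing can replay it
print(can_deallocate(load_user, 12))  # False: its producer may still misspeculate
```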
    • 7. Granted invention patent
    • Title: Dynamic mitigation of thread hogs on a threaded processor
    • Publication No.: US08347309B2
    • Publication date: 2013-01-01
    • Application No.: US12511620
    • Filing date: 2009-07-29
    • Inventors: Jared C. Smolens; Robert T. Golla; Matthew B. Smittle
    • IPC: G06F9/46; G06F9/30
    • CPC: G06F9/5016; G06F2209/504; G06F2209/507; Y02D10/22
    • Abstract: Systems and methods for efficient thread arbitration in a processor. A processor comprises a multi-threaded resource. The resource may include an array of entries which may be allocated by threads. A thread arbitration table corresponding to a given thread stores a high and a low threshold value in each table entry. A thread history shift register (HSR) indexes the table, wherein each bit of the HSR indicates whether the given thread is a thread hog. When the given thread has more allocated entries in the array than the high threshold of the table entry, the given thread is stalled from further allocating array entries. Similarly, when the given thread has fewer allocated entries in the array than the low threshold of the selected table entry, the given thread is permitted to allocate entries. In this manner, threads that hog dynamic resources can be mitigated such that more resources are available to other threads that are not thread hogs. This can result in a significant increase in overall processor performance.
    • 8. Granted invention patent
    • Title: Optimal deallocation of instructions from a unified pick queue
    • Publication No.: US09286075B2
    • Publication date: 2016-03-15
    • Application No.: US12571200
    • Filing date: 2009-09-30
    • Inventors: Matthew B. Smittle; Robert T. Golla
    • IPC: G06F9/38
    • CPC: G06F9/3836; G06F9/3838; G06F9/384; G06F9/3842; G06F9/3851; G06F9/3857
    • Abstract: Systems and methods for efficient out-of-order dynamic deallocation of entries within a shared storage resource in a processor. A processor comprises a unified pick queue that includes an array configured to dynamically allocate any entry of a plurality of entries for a decoded and renamed instruction. This instruction may correspond to any available active threads supported by the processor. The processor includes circuitry configured to determine whether an instruction corresponding to an allocated entry of the plurality of entries is dependent on a speculative instruction and whether the instruction has a fixed instruction execution latency. In response to determining the instruction is not dependent on a speculative instruction, the instruction has a fixed instruction execution latency, and said latency has transpired, the circuitry may deallocate the instruction from the allocated entry.
    • 9. Granted invention patent
    • Title: Processor operating mode for mitigating dependency conditions between instructions having different operand sizes
    • Publication No.: US08504805B2
    • Publication date: 2013-08-06
    • Application No.: US12428464
    • Filing date: 2009-04-22
    • Inventors: Robert T. Golla; Paul J. Jordan; Jama I. Barreh; Matthew B. Smittle; Yuan C. Chou; Jared C. Smolens
    • IPC: G06F7/483
    • CPC: G06F9/3838; G06F9/30032; G06F9/30109; G06F9/30189
    • Abstract: Various techniques for mitigating dependencies between groups of instructions are disclosed. In one embodiment, such dependencies include “evil twin” conditions, in which a first floating-point instruction has as a destination a first portion of a logical floating-point register (e.g., a single-precision write), and in which a second, subsequent floating-point instruction has as a source the first portion and a second portion of the same logical floating-point register (e.g., a double-precision read). The disclosed techniques may be applicable in a multithreaded processor implementing register renaming. In one embodiment, a processor may enter an operating mode in which detection of evil twin “producers” (e.g., single-precision writes) causes the instruction sequence to be modified to break potential dependencies. Modification of the instruction sequence may continue until one or more exit criteria are reached (e.g., committing a predetermined number of single-precision writes). This operating mode may be employed on a per-thread basis. (A toy model of the detection and mode transitions follows this entry.)
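A toy model of the detection and per-thread mode transitions described above. The instruction representation, the "+merge" marker standing in for the actual rewrite, and the exit threshold of four committed single-precision writes are all assumptions; the abstract does not specify how the sequence is modified.

```python
# Abstract model of "evil twin" detection and the mitigation operating mode.
# Encoding, rewrite marker, and exit threshold are illustrative assumptions.

EXIT_AFTER_SP_WRITES = 4     # assumed exit criterion: N single-precision commits

def run_thread(instrs):
    """instrs: list of dicts with keys op, dest, dest_width (32/64), and
    srcs = list of (reg, width) reads. Yields (op, mitigating) per instruction."""
    mitigating = False
    sp_writes = 0
    wrote_single = {}        # logical FP reg -> last write was single-precision
    for ins in instrs:
        # Evil-twin consumer: a double-precision read of a register whose most
        # recent producer wrote only its single-precision half.
        if any(w == 64 and wrote_single.get(r) for r, w in ins["srcs"]):
            mitigating, sp_writes = True, 0     # enter the mitigation mode
        if mitigating and ins["dest_width"] == 32 and ins["dest"]:
            ins["op"] += "+merge"    # stand-in for breaking the partial-write dependency
            sp_writes += 1
            if sp_writes >= EXIT_AFTER_SP_WRITES:
                mitigating = False   # exit criterion reached
        if ins["dest"]:
            wrote_single[ins["dest"]] = (ins["dest_width"] == 32)
        yield ins["op"], mitigating

program = [
    {"op": "fadds", "dest": "f2", "dest_width": 32, "srcs": []},
    {"op": "faddd", "dest": "f4", "dest_width": 64, "srcs": [("f2", 64)]},  # evil twin
    {"op": "fmuls", "dest": "f6", "dest_width": 32, "srcs": []},
]
print(list(run_thread(program)))
# [('fadds', False), ('faddd', True), ('fmuls+merge', True)]
```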
    • 10. Invention patent application
    • Title: DIVISION UNIT WITH MULTIPLE DIVIDE ENGINES
    • Publication No.: US20130179664A1
    • Publication date: 2013-07-11
    • Application No.: US13345391
    • Filing date: 2012-01-06
    • Inventors: Christopher H. Olson; Jeffrey S. Brooks; Matthew B. Smittle
    • IPC: G06F7/487; G06F9/38; G06F7/537; G06F9/302; G06F5/01; G06F9/30
    • CPC: G06F9/3895; G06F7/49936; G06F7/535; G06F7/5375; G06F9/3001; G06F9/3875; G06F9/3885
    • Abstract: Techniques are disclosed relating to integrated circuits that include hardware support for divide and/or square root operations. In one embodiment, an integrated circuit is disclosed that includes a division unit that, in turn, includes a normalization circuit and a plurality of divide engines. The normalization circuit is configured to normalize a set of operands. Each divide engine is configured to operate on a respective normalized set of operands received from the normalization circuit. In some embodiments, the integrated circuit includes a scheduler unit configured to select instructions for issuance to a plurality of execution units including the division unit. The scheduler unit is further configured to maintain a counter indicative of a number of instructions currently being operated on by the division unit, and to determine, based on the counter, whether to schedule subsequent instructions for issuance to the division unit. (A sketch of the counter-gated scheduling follows this entry.)
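The counter-gated scheduling can be pictured with a small model. The engine count, the Scheduler/DivisionUnit split, and the method names are assumptions; normalization and the iterative divide algorithm themselves are elided.

```python
# Small model of the scheduling idea above: a division unit with several
# divide engines behind one normalization stage, and a scheduler counter that
# gates issue. Engine count and interfaces are illustrative assumptions.

NUM_DIVIDE_ENGINES = 2

class DivisionUnit:
    def __init__(self):
        self.busy = 0                      # divides currently in flight

    def can_accept(self):
        return self.busy < NUM_DIVIDE_ENGINES

    def issue(self, a, b):
        # A real unit would first normalize the operands, then hand the
        # normalized pair to a free divide engine; both steps are elided here.
        assert self.can_accept()
        self.busy += 1
        return a / b                       # placeholder for the engine's result

    def complete_one(self):
        self.busy -= 1

class Scheduler:
    """Keeps its own count of divides outstanding in the division unit and
    only issues a new divide when that counter says an engine is free."""
    def __init__(self, div_unit):
        self.div_unit = div_unit
        self.outstanding = 0

    def try_issue_divide(self, a, b):
        if self.outstanding >= NUM_DIVIDE_ENGINES:
            return None                    # hold the instruction this cycle
        self.outstanding += 1
        return self.div_unit.issue(a, b)

    def on_divide_complete(self):
        self.outstanding -= 1
        self.div_unit.complete_one()

sched = Scheduler(DivisionUnit())
print(sched.try_issue_divide(10.0, 4.0))   # 2.5  -> first engine
print(sched.try_issue_divide(9.0, 3.0))    # 3.0  -> second engine
print(sched.try_issue_divide(8.0, 2.0))    # None -> both engines busy
sched.on_divide_complete()
print(sched.try_issue_divide(8.0, 2.0))    # 4.0  -> an engine freed up
```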