    • 2. Granted Patent
    • Circuit and method for instruction compression and dispersal in wide-issue processors
    • US07143268B2
    • 2006-11-28
    • US09751674
    • 2000-12-29
    • Paolo Faraboschi; Anthony X. Jarvis; Mark Owen Homewood; Geoffrey M. Brown; Gary L. Vondran
    • G06F9/30
    • G06F9/3891; G06F9/30054; G06F9/30101; G06F9/3802; G06F9/3853; G06F9/3885
    • A data processor includes execution clusters, an instruction cache, an instruction issue unit, and alignment and dispersal circuitry. Each execution cluster includes an instruction execution pipeline having a number of processing stages, and each execution pipeline is a number of lanes wide. The processing stages execute instruction bundles, where each instruction bundle has one or more syllables. Each lane is capable of receiving one of the syllables of an instruction bundle. The instruction cache includes a number of cache lines. The instruction issue unit receives fetched cache lines and issues complete instruction bundles toward the execution clusters. The alignment and dispersal circuitry receives the complete instruction bundles from the instruction issue unit and routes each received complete instruction bundle to a correct one of the execution clusters. The complete instruction bundles are routed as a function of at least one address bit associated with each complete instruction bundle.
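The routing rule in the abstract above can be sketched in software terms (a purely illustrative model, not the patented circuit; the function name, bit position, and two-cluster configuration are assumptions): the destination cluster is derived from one or more bits of the bundle's address.

```python
def route_bundle(bundle_address: int, num_clusters: int = 2, bit: int = 2) -> int:
    """Pick an execution cluster from address bits of an instruction bundle.

    With a power-of-two number of clusters, the cluster-select field
    starting at `bit` identifies the destination cluster directly.
    """
    return (bundle_address >> bit) & (num_clusters - 1)

# Two bundles whose addresses differ only in bit 2 are steered to
# different clusters.
print(route_bundle(0x00))  # cluster 0
print(route_bundle(0x04))  # cluster 1
```

Selecting the cluster from address bits keeps the dispersal logic to a mask and a shift, which is why the abstract emphasizes routing "as a function of at least one address bit".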
    • 6. Granted Patent
    • System and method for executing variable latency load operations in a date processor
    • US07757066B2
    • 2010-07-13
    • US09751372
    • 2000-12-29
    • Anthony X. Jarvis; Paolo Faraboschi
    • G06F9/30
    • G06F9/3867; G06F9/30043; G06F9/3824; G06F9/3826; G06F9/3828; G06F9/3885; G06F9/3891; G06F12/0855
    • There is disclosed a data processor that executes variable latency load operations using bypass circuitry that allows load word operations to avoid stalls caused by shifting circuitry. The processor comprises: 1) an instruction execution pipeline comprising N processing stages, each of the N processing stages for performing one of a plurality of execution steps associated with a pending instruction being executed by the instruction execution pipeline; 2) a data cache for storing data values used by the pending instruction; 3) a plurality of registers for receiving the data values from the data cache; 4) a load store unit for transferring a first one of the data values from the data cache to a target one of the plurality of registers during execution of a load operation; 5) a shifter circuit associated with the load store unit for shifting the first data value prior to loading the first data value into the target register; and 6) bypass circuitry associated with the load store unit for transferring the first data value from the data cache directly to the target register without processing the first data value in the shifter circuit.
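The load-path split in the abstract above can be modeled roughly as follows (an illustrative sketch only; function and parameter names are invented, and little-endian byte numbering is assumed): a full aligned word takes the bypass path and skips the shifter, while sub-word loads pay for the shift-and-mask step.

```python
def load_value(cache_word: int, size_bytes: int, byte_offset: int) -> int:
    """Model the two load paths: bypass for full-word loads,
    shift and mask for sub-word (byte/halfword) loads."""
    if size_bytes == 4 and byte_offset == 0:
        # Bypass path: the aligned word goes straight to the target register.
        return cache_word & 0xFFFFFFFF
    # Shifter path: align the addressed bytes, then mask to the access size.
    shifted = cache_word >> (8 * byte_offset)
    return shifted & ((1 << (8 * size_bytes)) - 1)

print(hex(load_value(0x11223344, 4, 0)))  # full word, bypass path
print(hex(load_value(0x11223344, 1, 2)))  # one byte at offset 2, shifter path
```

The point of the bypass is latency, not correctness: the common load-word case never waits on the shifter stage at all.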
    • 7. Patent Application
    • REPLAY OF DETECTED PATTERNS IN PREDICTED INSTRUCTIONS
    • US20120117362A1
    • 2012-05-10
    • US12943859
    • 2010-11-10
    • Ravindra N. Bhargava; David Suggs; Anthony X. Jarvis
    • G06F9/38
    • G06F9/3848; G06F9/381
    • Techniques are disclosed relating to improving the performance of branch prediction in processors. In one embodiment, a processor is disclosed that includes a branch prediction unit configured to predict a sequence of instructions to be issued by the processor for execution. The processor also includes a pattern detection unit configured to detect a pattern in the predicted sequence of instructions, where the pattern includes a plurality of predicted instructions. In response to the pattern detection unit detecting the pattern, the processor is configured to switch from issuing instructions predicted by the branch prediction unit to issuing the plurality of instructions. In some embodiments, the processor includes a replay unit that is configured to replay fetch addresses to an instruction fetch unit to cause the plurality of predicted instructions to be issued.
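The detection step described above can be sketched in software (hypothetical; the real unit operates on fetch addresses in hardware, and the function name is invented): find the shortest repeating pattern in the predicted stream, after which the processor can replay that pattern instead of re-predicting each instruction.

```python
def shortest_repeating_pattern(fetch_addrs, min_repeats=2):
    """Return the shortest prefix that repeats to cover the whole
    sequence at least `min_repeats` times, or None if there is none."""
    n = len(fetch_addrs)
    for plen in range(1, n // min_repeats + 1):
        if n % plen == 0 and all(
            fetch_addrs[i] == fetch_addrs[i % plen] for i in range(n)
        ):
            return fetch_addrs[:plen]
    return None

# A two-address loop body detected in the predicted stream can be
# replayed directly, bypassing the branch predictor.
print(shortest_repeating_pattern([0x100, 0x104, 0x100, 0x104]))
```

Once a pattern is found, a replay unit like the one in the abstract only needs to feed the stored addresses back to the fetch unit in a loop.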
    • 8. Granted Patent
    • Hybrid branch prediction device with sparse and dense prediction caches
    • US08181005B2
    • 2012-05-15
    • US12205429
    • 2008-09-05
    • Gerald D. Zuraski, Jr.; James D. Dundas; Anthony X. Jarvis
    • G06F9/32; G06F9/38
    • G06F9/3844; G06F9/3806
    • A system and method for branch prediction in a microprocessor. A hybrid device stores branch prediction information in a sparse cache for no more than a common smaller number of branches within each entry of the instruction cache. For the less common case wherein an i-cache line comprises additional branches, the device stores the corresponding branch prediction information in a dense cache. Each entry of the sparse cache stores a bit vector indicating whether or not a corresponding instruction cache line includes additional branch instructions. This indication may also be used to select an entry in the dense cache for storage. A second sparse cache stores entire evicted entries from the first sparse cache.
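The sparse/dense split above can be sketched as a small software model (illustrative only, not the hardware design; class and field names are invented). Up to a small fixed number of branches per i-cache line live in the sparse cache; additional branches overflow into the dense cache, flagged by a per-line indicator:

```python
class HybridPredictorStorage:
    """Toy model of sparse/dense branch-prediction storage."""

    def __init__(self, sparse_ways=2):
        self.sparse_ways = sparse_ways
        self.sparse = {}           # cache-line addr -> [(branch_pc, info)]
        self.dense = {}            # branch_pc -> info
        self.has_overflow = set()  # lines with extra branches in the dense cache

    def insert(self, line_addr, branch_pc, info):
        ways = self.sparse.setdefault(line_addr, [])
        if len(ways) < self.sparse_ways:
            ways.append((branch_pc, info))    # common case: sparse cache
        else:
            self.dense[branch_pc] = info      # overflow: dense cache
            self.has_overflow.add(line_addr)  # per-line overflow indicator

    def lookup(self, line_addr, branch_pc):
        for pc, info in self.sparse.get(line_addr, []):
            if pc == branch_pc:
                return info
        if line_addr in self.has_overflow:    # probe dense only when flagged
            return self.dense.get(branch_pc)
        return None
```

The overflow indicator plays the role of the abstract's bit vector: the dense cache is consulted only for lines known to hold more branches than a sparse entry can describe, so the common case stays cheap.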
    • 9. Patent Application
    • CLASSIFYING AND SEGREGATING BRANCH TARGETS
    • US20110093658A1
    • 2011-04-21
    • US12581878
    • 2009-10-19
    • Gerald D. Zuraski, Jr.; James D. Dundas; Anthony X. Jarvis
    • G06F9/38; G06F12/08
    • G06F9/3844; G06F9/3806
    • A system and method for branch prediction in a microprocessor. A branch prediction unit stores an indication of the location of a branch target instruction relative to its corresponding branch instruction. For example, a target instruction may be located within the same first region of memory as the branch instruction. Alternatively, the target instruction may be located outside the first region, but within a larger second region. The prediction unit comprises a branch target array corresponding to each region. Each array stores a bit range of a branch target address, wherein the stored bit range is based upon the location of the target instruction relative to the branch instruction. The prediction unit constructs a predicted branch target address by concatenating the bits stored in the branch target arrays.
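The concatenation step above can be illustrated for the same-region case (a sketch under invented names; `low_width` stands in for the size of the stored bit range): the array supplies only the target's low-order bits, and the branch PC supplies the rest.

```python
def predicted_target(branch_pc: int, stored_low_bits: int, low_width: int) -> int:
    """Rebuild a full target address by concatenating the branch PC's
    upper bits with the low-order bits stored in the target array.

    Valid when the target lies in the same 2**low_width-byte region
    as the branch instruction, so the upper bits are shared.
    """
    upper = (branch_pc >> low_width) << low_width  # shared high-order bits
    return upper | stored_low_bits                 # concatenate with stored bits

print(hex(predicted_target(0x12345678, 0x00AB, 16)))
```

Storing only the bits that can differ is the space saving: targets near their branch need a narrow array entry, and only far targets need the wider second-region array.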
    • 10. Granted Patent
    • Bypass circuitry for use in a pipelined processor
    • US07093107B2
    • 2006-08-15
    • US09751377
    • 2000-12-29
    • Anthony X. Jarvis
    • G06F15/00
    • G06F9/3013; G06F9/3828; G06F9/3857; G06F9/3891
    • There is disclosed a data processor that uses bypass circuitry to transfer result data from late pipeline stages to earlier pipeline stages in an efficient manner and with a minimum amount of wiring. The data processor comprises: 1) an instruction execution pipeline comprising a) a read stage; b) a write stage; and c) a first execution stage comprising E execution units that produce data results from data operands. The data processor also comprises: 2) a register file comprising a plurality of data registers, each of the data registers being read by the read stage of the instruction pipeline via at least one of R read ports of the register file and each of the data registers being written by the write stage of the instruction pipeline via at least one of W write ports of the register file; and 3) bypass circuitry for receiving data results from output channels of source devices in at least one of the write stage and the first execution stage, the bypass circuitry comprising a first plurality of bypass tristate line drivers having input channels coupled to first output channels of a first plurality of source devices and tristate output channels coupled to a first common read data channel in the read stage.
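In software terms, the bypass network above amounts to priority-selecting the newest in-flight result for a register before falling back to the register file (an illustrative sketch; names are invented, and the tristate drivers are modeled as a simple ordered lookup):

```python
def read_operand(reg, register_file, inflight_results):
    """Return the operand for `reg`, preferring results still in the
    pipeline (listed newest-first) over the architectural register file."""
    for dest_reg, value in inflight_results:  # bypass paths, newest first
        if dest_reg == reg:
            return value
    return register_file[reg]                 # no bypass hit: read the file

regs = {1: 10, 2: 20}
# r1 was just recomputed in a late stage but not yet written back.
print(read_operand(1, regs, [(1, 99), (2, 20)]))  # bypassed value 99
print(read_operand(3, {3: 7}, []))                # register-file value 7
```

In the hardware described by the abstract, the "newest first" priority is realized by which tristate driver is enabled onto the shared read data channel, so many sources can feed one read port with minimal wiring.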