    • 13. Granted invention patent
    • Reduced register-dependency checking for paired-instruction dispatch in a superscalar processor with partial register writes
    • Publication number: US5790826A
    • Publication date: 1998-08-04
    • Application number: US618636
    • Application date: 1996-03-19
    • Inventors: Shalesh Thusoo, Gene Shen, James S. Blomgren
    • IPC: G06F9/30; G06F9/312; G06F9/38; G06F9/28
    • CPC: G06F9/30043; G06F9/30112; G06F9/3824; G06F9/3834; G06F9/3836; G06F9/3857
    • Abstract: The dispatch unit of a superscalar processor checks for register dependencies among instructions to be issued together as a group. The first instruction's destination register is compared to the following instructions' sources, but the destinations of following instructions are not checked against the first instruction's destination. Instead, instructions with destination-destination dependencies are dispatched together as a group. These instructions flow down the pipelines. At the end of the pipelines the destinations are compared. If the destinations match then the results are merged together and written to the register. When instructions write to only a portion of the register, merging ensures that the correct portions of the register are written by the appropriate instructions in the group. Thus older code which performs partial-register writes can benefit from superscalar processing by dispatching the instructions together as a group and then merging the writes together at the end of the pipelines. The dispatch and decode stage, which is often a critical path on the processor, is reduced in complexity by not checking for destination-register dependencies. Performance increases because more kinds of instructions can be dispatched together in a group, increasing the use of the superscalar features. (A toy sketch of the writeback merge follows this entry.)
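A minimal Python sketch of the merge-at-writeback step described in the abstract above: results from co-dispatched instructions that target the same destination register are combined byte by byte before a single write to the register file, so each instruction's partial write lands in the correct portion. The 32-bit register width, the byte-mask encoding, and the `merge_writeback` helper are illustrative assumptions, not details taken from the patent.

```python
# Hypothetical model: the register file maps register ids to 32-bit values.
# Each pipeline result carries a destination id, a 32-bit value, and a 4-bit
# byte mask naming which bytes of the register the instruction actually writes.

def merge_writeback(reg_file, results):
    """Merge partial-register writes from one co-dispatched group (oldest first)."""
    merged = {}
    for dest, value, byte_mask in results:
        current = merged.get(dest, reg_file[dest])
        for byte in range(4):
            if byte_mask & (1 << byte):
                shift = 8 * byte
                # Replace only the bytes this instruction writes; younger
                # instructions in the group override older ones.
                current = (current & ~(0xFF << shift) & 0xFFFFFFFF) | (value & (0xFF << shift))
        merged[dest] = current
    reg_file.update(merged)

# Two instructions dispatched as a group both write parts of register 0,
# like MOV AL, 0x34 followed by MOV AH, 0x12 on an x86-style register.
regs = {0: 0xDEADBEEF}
merge_writeback(regs, [
    (0, 0x00000034, 0b0001),  # older instruction writes byte 0 (AL)
    (0, 0x00001200, 0b0010),  # younger instruction writes byte 1 (AH)
])
assert regs[0] == 0xDEAD1234
print(hex(regs[0]))
```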
    • 14. Granted invention patent
    • Mixed-modulo address generation using shadow segment registers
    • Publication number: US5790443A
    • Publication date: 1998-08-04
    • Application number: US618632
    • Application date: 1996-03-19
    • Inventors: Gene Shen, Shalesh Thusoo, James S. Blomgren, Betty Kikuta
    • IPC: G06F7/50; G06F7/509; G06F9/30; G06F9/355; G06F12/02; G06F7/38; G06F9/26
    • CPC: G06F9/30116; G06F12/0292; G06F7/509; G06F9/3013; G06F9/355; G06F9/3552; G06F7/49931; G06F7/49994
    • Abstract: A mixed-modulo address generation unit has several inputs. The unit effectively adds together a subset of these inputs in a reduced modulus while simultaneously adding other inputs in a full modulus to the partial sum of reduced-modulus inputs. The subset of inputs receives reduced-width address components such as 16-bit address components which are effectively added together modulo 64K. The other inputs receive full-width address components such as 32-bit components which are added in the full modulus, 4G. Reduced-width components are zero-extended to 32 bits before input to a standard 32-bit adder. A 16-bit carry generator also receives the reduced-width components and generates the carries out of the 16th bit position. When one or more carries are detected, a correction term is subtracted from the initial sum, which is recirculated to the adder's input in a subsequent step. The correction term is the number of carries out of the 16th bit position multiplied by 64K. The full-width segment bases for all active segments are stored in the register file, but the most commonly accessed segments, the data and stack segments, have a copy of their segment bases also stored in a shadow register for input to the adder. Thus the number of read ports to the register file is reduced by the shadow segment register. Less-frequently-used segments require an additional step through the adder to generate the address, but addresses in the data and stack segments are generated in a single cycle. (A worked sketch of the carry correction follows this entry.)
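A worked Python sketch of the mixed-modulo add summarized above: the 16-bit components are zero-extended and summed together with the 32-bit components in one 32-bit add, the carries out of the 16th bit position of the 16-bit components are counted, and 64K is subtracted per carry so those components wrap inside the segment. The function name `mixed_modulo_address` is an assumption, and the correction is applied in a single expression here rather than by recirculating through the adder as the abstract describes.

```python
# Minimal sketch: 16-bit components wrap at 64K among themselves, while 32-bit
# components (such as a segment base held in a shadow segment register) are
# added modulo 4G.

MASK32 = 0xFFFFFFFF

def mixed_modulo_address(reduced16, full32):
    # One pass through a 32-bit adder with the 16-bit parts zero-extended.
    total = (sum(reduced16) + sum(full32)) & MASK32
    # A 16-bit carry generator would count the carries out of the 16th bit
    # position of the reduced-width parts; here we just take the high bits.
    carries = sum(reduced16) >> 16
    # Correction term: 64K per carry, so the 16-bit parts wrap inside 64K.
    return (total - carries * 0x10000) & MASK32

# Example: 16-bit base + index + displacement wrap within the 64K segment,
# then the 32-bit segment base is added in the full modulus.
segment_base = 0x00120000
addr = mixed_modulo_address([0xF000, 0x2000, 0x0010], [segment_base])
assert addr == segment_base + ((0xF000 + 0x2000 + 0x0010) & 0xFFFF)
print(hex(addr))  # 0x121010
```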
    • 17. Granted invention patent
    • Processing pipeline having stage-specific thread selection and method thereof
    • Publication number: US08086825B2
    • Publication date: 2011-12-27
    • Application number: US11967923
    • Application date: 2007-12-31
    • Inventors: Gene Shen, Sean Lie, Marius Evers
    • IPC: G06F9/38; G06F9/48
    • CPC: G06F9/3867; G06F9/3851; G06F9/3891
    • Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage. (A toy sketch of per-stage thread selection follows this entry.)
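A toy Python model of stage-specific thread selection as summarized above: each pipeline stage keeps its own selection state and, every cycle, independently chooses which thread's buffered instruction to advance, rather than a single global thread choice for the whole pipeline. The stage names, the two-thread setup, and the round-robin policy are assumptions for illustration, not the claimed design.

```python
from collections import deque

class Stage:
    """A pipeline stage with per-thread input buffers and its own round-robin
    pointer, so thread selection is made independently at each stage."""

    def __init__(self, name, num_threads):
        self.name = name
        self.queues = [deque() for _ in range(num_threads)]
        self.rr = 0  # this stage's private round-robin pointer

    def select(self):
        """Pick one ready thread for this cycle, independently of other stages."""
        n = len(self.queues)
        for i in range(n):
            tid = (self.rr + i) % n
            if self.queues[tid]:
                self.rr = (tid + 1) % n
                return tid, self.queues[tid].popleft()
        return None

def run(pipeline, cycles):
    for cycle in range(cycles):
        # Advance back-to-front so an instruction moves at most one stage per cycle.
        for i in reversed(range(len(pipeline))):
            picked = pipeline[i].select()
            if picked is None:
                continue
            tid, insn = picked
            if i + 1 < len(pipeline):
                pipeline[i + 1].queues[tid].append(insn)
            else:
                print(f"cycle {cycle}: thread {tid} completed {insn}")

pipe = [Stage(name, num_threads=2) for name in ("fetch", "decode", "dispatch", "execute")]
pipe[0].queues[0].extend(["T0-a", "T0-b"])
pipe[0].queues[1].extend(["T1-a", "T1-b"])
run(pipe, cycles=8)
```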
    • 18. Granted invention patent
    • Processing pipeline having parallel dispatch and method thereof
    • Publication number: US07793080B2
    • Publication date: 2010-09-07
    • Application number: US11967924
    • Application date: 2007-12-31
    • Inventors: Gene Shen, Sean Lie
    • IPC: G06F9/30
    • CPC: G06F9/3885; G06F9/3822; G06F9/3842; G06F9/3851
    • Abstract: One or more processor cores of a multiple-core processing device each can utilize a processing pipeline having a plurality of execution units (e.g., integer execution units or floating point units) that together share a pre-execution front-end having instruction fetch, decode and dispatch resources. Further, one or more of the processor cores each can implement dispatch resources configured to dispatch multiple instructions in parallel to multiple corresponding execution units via separate dispatch buses. The dispatch resources further can opportunistically decode and dispatch instruction operations from multiple threads in parallel so as to increase the dispatch bandwidth. Moreover, some or all of the stages of the processing pipelines of one or more of the processor cores can be configured to implement independent thread selection for the corresponding stage. (A toy sketch of dispatch over separate buses follows this entry.)
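A toy Python model of the parallel-dispatch aspect emphasized in this patent's title: a shared in-order buffer feeds two execution units over separate dispatch buses, so up to two operations, possibly from different threads, are dispatched in the same cycle. The operation tuples, unit names, and oldest-first scan are illustrative assumptions.

```python
from collections import deque

# A shared in-order buffer of decoded operations: (thread, target unit, text).
ops = deque([
    ("T0", "int", "add r1, r2"),
    ("T1", "fp",  "mul f0, f1"),
    ("T0", "fp",  "add f2, f3"),
    ("T1", "int", "sub r4, r5"),
])

execution_units = ("int", "fp")  # one dispatch bus per execution unit

cycle = 0
while ops:
    chosen = {}  # execution unit -> op claimed for that unit's bus this cycle
    # Scan oldest-first and claim at most one op per bus, regardless of thread.
    for op in list(ops):
        _, unit, _ = op
        if unit not in chosen:
            chosen[unit] = op
        if len(chosen) == len(execution_units):
            break
    for unit, op in chosen.items():
        ops.remove(op)
        print(f"cycle {cycle}: bus[{unit}] <- thread {op[0]}: {op[2]}")
    cycle += 1
```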