    • 2. Patent application
    • DIVISION UNIT WITH MULTIPLE DIVIDE ENGINES
    • US20130179664A1
    • 2013-07-11
    • US13345391
    • 2012-01-06
    • Christopher H. Olson; Jeffrey S. Brooks; Matthew B. Smittle
    • G06F7/487; G06F9/38; G06F7/537; G06F9/302; G06F5/01; G06F9/30
    • G06F9/3895; G06F7/49936; G06F7/535; G06F7/5375; G06F9/3001; G06F9/3875; G06F9/3885
    • Techniques are disclosed relating to integrated circuits that include hardware support for divide and/or square root operations. In one embodiment, an integrated circuit is disclosed that includes a division unit that, in turn, includes a normalization circuit and a plurality of divide engines. The normalization circuit is configured to normalize a set of operands. Each divide engine is configured to operate on a respective normalized set of operands received from the normalization circuit. In some embodiments, the integrated circuit includes a scheduler unit configured to select instructions for issuance to a plurality of execution units including the division unit. The scheduler unit is further configured to maintain a counter indicative of a number of instructions currently being operated on by the division unit, and to determine, based on the counter, whether to schedule subsequent instructions for issuance to the division unit.
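The counter-based scheduling decision described in this abstract can be sketched as a small model. This is an illustrative sketch only, not the patented circuit: the class name `DivideScheduler`, its methods, and the assumption that the division unit accepts one instruction per divide engine are all hypothetical.

```python
class DivideScheduler:
    """Toy model of a scheduler that tracks in-flight divide instructions."""

    def __init__(self, num_engines: int):
        # Assumption: the unit can hold one instruction per divide engine.
        self.capacity = num_engines
        self.in_flight = 0  # the counter mentioned in the abstract

    def try_issue(self) -> bool:
        """Issue a divide only if the counter shows spare capacity."""
        if self.in_flight >= self.capacity:
            return False
        self.in_flight += 1
        return True

    def complete(self) -> None:
        """Called when the division unit finishes an instruction."""
        if self.in_flight == 0:
            raise RuntimeError("no divide instruction in flight")
        self.in_flight -= 1


sched = DivideScheduler(num_engines=2)
issued = [sched.try_issue() for _ in range(3)]  # the third issue is refused
sched.complete()                                # one divide finishes
issued_after_completion = sched.try_issue()     # capacity is available again
```

The scheduler never inspects the divide engines directly; it decides from the counter alone, which is the property the abstract emphasizes.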
    • 3. Granted patent
    • Processor which implements fused and unfused multiply-add instructions in a pipelined manner
    • US08239440B2
    • 2012-08-07
    • US12057894
    • 2008-03-28
    • Jeffrey S. Brooks; Christopher H. Olson
    • G06F7/38
    • G06F7/483; G06F7/5443; G06F2207/3884
    • Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
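The numerical difference between the two instruction types can be shown in software. This sketch illustrates the arithmetic only, not the FUMA block itself: an unfused multiply-add rounds the product and then the sum, while a fused multiply-add rounds once. The function names are hypothetical, and the fused result is emulated with exact rational arithmetic rather than a hardware instruction.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    # Two roundings: the product is rounded to a double, then the sum is.
    return a * b + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    # One rounding: compute a*b + c exactly, then round once to a double.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27   # a * b is exactly 1 - 2**-54, not representable as a double
c = -1.0

unfused = unfused_multiply_add(a, b, c)  # product rounds to 1.0, so the sum is 0.0
fused = fused_multiply_add(a, b, c)      # exact result -2**-54 survives
```

Because the two results can differ, a pipeline that supports both instruction types has to choose the rounding behaviour per operation, which the abstract attributes to an opcode or mode bit.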
    • 6. Granted patent
    • System and method of bypassing unrounded results in a multiply-add pipeline unit
    • US08671129B2
    • 2014-03-11
    • US13043101
    • 2011-03-08
    • Jeffrey S. Brooks; Christopher H. Olson
    • G06F7/32
    • G06F7/49947; G06F7/483; G06F7/5318; G06F7/5338; G06F7/5443; G06F2207/3884
    • A processing unit, system, and method for performing a multiply operation in a multiply-add pipeline. To reduce the pipeline latency, the unrounded result of a multiply-add operation is bypassed to the inputs of the multiply-add pipeline for use in a subsequent operation. If it is determined that rounding is required for the prior operation, then the rounding will occur during the subsequent operation. During the subsequent operation, a Booth encoder not utilized by the multiply operation will output a rounding correction factor as a selection input to a Booth multiplexer not utilized by the multiply operation. When the Booth multiplexer receives the rounding correction factor, the Booth multiplexer will output a rounding correction value to a carry save adder (CSA) tree, and the CSA tree will generate the correct sum from the rounding correction value and the other partial products.
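The idea of deferring rounding into the next multiply can be illustrated with integer significands. This is a simplified sketch under stated assumptions: rounding the prior result means adding at most one unit in the last place, so the correction is one extra partial product equal to the other operand. The function name and the integer model are hypothetical, and the Booth encoding itself is not modelled.

```python
def multiply_with_bypass(unrounded: int, round_up: bool, multiplicand: int) -> int:
    """Multiply a bypassed, not-yet-rounded significand by `multiplicand`.

    Assumption: rounding the prior result would add exactly 1 when
    `round_up` is set, so (unrounded + 1) * m == unrounded * m + m.
    """
    partial_sum = unrounded * multiplicand        # stands in for the regular partial products
    correction = multiplicand if round_up else 0  # the extra "rounding correction" term
    return partial_sum + correction               # stands in for the CSA tree sum


unrounded_result = 0b1011_0111
other_operand = 0b1101

with_rounding = multiply_with_bypass(unrounded_result, True, other_operand)
without_rounding = multiply_with_bypass(unrounded_result, False, other_operand)
```

The corrected product equals what would have been computed had the first result been rounded before being fed back, without waiting for the rounding step.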
    • 7. Patent application
    • Processor Pipeline which Implements Fused and Unfused Multiply-Add Instructions
    • US20120221614A1
    • 2012-08-30
    • US13469212
    • 2012-05-11
    • Jeffrey S. Brooks; Christopher H. Olson
    • G06F7/48
    • G06F7/483; G06F7/5443; G06F2207/3884
    • Implementing an unfused multiply-add instruction within a fused multiply-add pipeline. The system may include an aligner having an input for receiving an addition term, a multiplier tree having two inputs for receiving a first value and a second value for multiplication, and a first carry save adder (CSA), wherein the first CSA may receive partial products from the multiplier tree and an aligned addition term from the aligner. The system may include a fused/unfused multiply add (FUMA) block which may receive the first partial product, the second partial product, and the aligned addition term, wherein the first partial product and the second partial product are not truncated. The FUMA block may perform an unfused multiply add operation or a fused multiply add operation using the first partial product, the second partial product, and the aligned addition term, e.g., depending on an opcode or mode bit.
    • 8. Granted patent
    • Method for selecting between divide instructions associated with respective threads in a multi-threaded processor
    • US07941642B1
    • 2011-05-10
    • US10881216
    • 2004-06-30
    • Robert T. Golla; Jeffrey S. Brooks; Christopher H. Olson
    • G06F9/30
    • G06F9/3001; G06F9/3851
    • In one embodiment, a multithreaded processor includes a multithreaded instruction source that may provide a plurality of instructions each corresponding to a respective one of a plurality of threads. The multithreaded processor also includes a pick unit coupled to the multithreaded instruction source. The pick unit may select in a given cycle, a first divide instruction corresponding to one thread of the plurality of threads and a second divide instruction corresponding to another thread of the plurality of threads based upon a thread selection algorithm. Further, the multithreaded processor includes a storage coupled to a functional unit including a divider configured to execute the first divide instruction and the second divide instruction. The storage may store one of the first and the second divide instructions during execution of the other of the first and the second divide instructions.
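The selection and buffering behaviour in this abstract can be sketched as follows. The abstract does not name the thread selection algorithm, so round-robin is an assumption here, and the function names `pick_two` and `execute_pair` are hypothetical.

```python
def pick_two(ready: dict, last_thread: int) -> list:
    """Pick two threads with ready divides, round-robin after `last_thread`.

    Assumption: round-robin stands in for the unspecified selection algorithm.
    """
    order = sorted(ready)
    # Start the search at the first thread id greater than last_thread.
    start = next((i for i, t in enumerate(order) if t > last_thread), 0)
    rotated = order[start:] + order[:start]
    return rotated[:2]


def execute_pair(ready: dict, last_thread: int = -1) -> list:
    """Execute the first picked divide while the second waits in storage."""
    first, second = pick_two(ready, last_thread)
    held = (second, ready[second])            # storage holds the second divide
    dividend, divisor = ready[first]
    results = [(first, dividend // divisor)]  # divider executes the first divide
    thread, (dividend, divisor) = held
    results.append((thread, dividend // divisor))  # then the stored one
    return results


# Thread id -> (dividend, divisor) for divides that are ready to issue.
ready_divides = {0: (10, 2), 1: (9, 3), 2: (8, 4)}
results = execute_pair(ready_divides, last_thread=0)
```

With `last_thread=0`, the pick starts at thread 1, so threads 1 and 2 are selected and thread 0 waits for a later cycle.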
    • 9. Patent application
    • PROCESSOR AND METHOD FOR IMPLEMENTING INSTRUCTION SUPPORT FOR MULTIPLICATION OF LARGE OPERANDS
    • US20100325188A1
    • 2010-12-23
    • US12488372
    • 2009-06-19
    • Christopher H. Olson; Jeffrey S. Brooks; Robert T. Golla; Paul J. Jordan
    • G06F7/52
    • G06F7/4876; G06F2207/382
    • A processor including instruction support for implementing large-operand multiplication may issue, for execution, programmer-selectable instructions from a defined instruction set architecture (ISA). The processor may include an instruction execution unit comprising a hardware multiplier datapath circuit, where the hardware multiplier datapath circuit is configured to multiply operands having a maximum number of bits M. In response to receiving a single instance of a large-operand multiplication instruction defined within the ISA, wherein at least one of the operands of the large-operand multiplication instruction includes more than the maximum number of bits M, the instruction execution unit is configured to multiply operands of the large-operand multiplication instruction within the hardware multiplier datapath circuit to determine a result of the large-operand multiplication instruction without execution of programmer-selected instructions within the ISA other than the large-operand multiplication instruction.
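How a multiplier limited to M-bit operands can still produce the product of larger operands can be sketched with schoolbook limb multiplication. This is an illustration only: the abstract does not state which decomposition the hardware uses, and the width `M = 64` and all function names are assumptions.

```python
M = 64                 # assumed width of the hardware multiplier datapath
MASK = (1 << M) - 1


def to_limbs(x: int, n_limbs: int) -> list:
    """Split x into n_limbs chunks of M bits, least significant first."""
    return [(x >> (M * i)) & MASK for i in range(n_limbs)]


def mul_m_bits(a: int, b: int) -> int:
    """Stand-in for the hardware multiplier: both operands must fit in M bits."""
    assert a <= MASK and b <= MASK
    return a * b


def large_multiply(x: int, y: int, n_limbs: int) -> int:
    """Multiply operands wider than M bits using only M-bit multiplies."""
    xs, ys = to_limbs(x, n_limbs), to_limbs(y, n_limbs)
    result = 0
    for i, xi in enumerate(xs):
        for j, yj in enumerate(ys):
            # Each partial product is shifted to its limb position and accumulated.
            result += mul_m_bits(xi, yj) << (M * (i + j))
    return result


x = 2 ** 200 + 12345
y = 2 ** 190 + 678
product = large_multiply(x, y, n_limbs=4)  # 4 limbs cover operands up to 256 bits
```

In software this loop would be a sequence of separate multiply and add instructions; the abstract describes a single instruction that drives the same datapath without them.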