    • 62. Granted invention patent
    • Microprocessor with ALU integrated into load unit
    • US09501286B2
    • 2016-11-22
    • US12609169
    • 2009-10-30
    • Gerard M. Col; Colin Eddy; Rodney E. Hooker
    • Gerard M. Col; Colin Eddy; Rodney E. Hooker
    • G06F9/38; G06F9/30; G06F12/08
    • G06F9/3875; G06F9/3001; G06F9/3004; G06F9/30043; G06F9/30145; G06F9/3017; G06F9/3893; G06F12/0875
    • A superscalar pipelined microprocessor includes a register set defined by its instruction set architecture, a cache memory, execution units, and a load unit, coupled to the cache memory and distinct from the other execution units. The load unit comprises an ALU. The load unit receives an instruction that specifies a memory address of a source operand, an operation to be performed on the source operand to generate a result, and a destination register of the register set to which the result is to be stored. The load unit reads the source operand from the cache memory. The ALU performs the operation on the source operand to generate the result, rather than forwarding the source operand to any of the other execution units of the microprocessor to perform the operation on the source operand to generate the result. The load unit outputs the result for subsequent retirement to the destination register.
    • 63. Granted invention patent
    • Vector matrix product accelerator for microprocessor integration
    • US09384168B2
    • 2016-07-05
    • US13914731
    • 2013-06-11
    • Analog Devices Global
    • Mikael Mortensen
    • G06F17/16; G06F9/30; G06F9/38
    • G06F17/16; G06F9/3001; G06F9/30036; G06F9/3824; G06F9/3893
    • In at least one example embodiment, a microprocessor circuit is provided that includes a microprocessor core coupled to a data memory via a data memory bus comprising a predetermined integer number of data wires (J); the single-ported data memory configured for storage of vector input elements of an N element vector in a predetermined vector element order and storage of matrix input elements of an M×N matrix comprising M columns of matrix input elements and N rows of matrix input elements; a vector matrix product accelerator comprising a datapath configured for multiplying the N element vector and the matrix to compute an M element result vector, the vector matrix product accelerator comprising: an input/output port interfacing the data memory bus to the vector matrix product accelerator; a plurality of vector input registers for storage respective input vector elements received through the input/output port.
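The arithmetic described in the abstract above (an N-element vector multiplied by a matrix with N rows and M columns to give an M-element result vector) can be sketched in plain Python. This is only an illustration of the computation, not the patented datapath; the function name `vector_matrix_product` and the list-of-rows matrix layout are assumptions made for the example.

```python
def vector_matrix_product(vector, matrix):
    """Multiply an N-element vector by an N-row, M-column matrix.

    `matrix` is a list of N rows, each holding M elements (an assumed
    layout for illustration). Returns the M-element result vector.
    """
    n = len(vector)
    if len(matrix) != n:
        raise ValueError("matrix must have one row per vector element")
    m = len(matrix[0]) if n else 0
    if any(len(row) != m for row in matrix):
        raise ValueError("all matrix rows must have the same length")

    result = []
    for col in range(m):
        # Multiply-accumulate down one matrix column.
        acc = 0
        for row in range(n):
            acc += vector[row] * matrix[row][col]
        result.append(acc)
    return result


if __name__ == "__main__":
    # N = 3, M = 2
    print(vector_matrix_product([1, 2, 3], [[1, 2], [3, 4], [5, 6]]))
```

Each result element is one multiply-accumulate pass over a matrix column, which is the kind of loop a hardware accelerator would typically offload from the core.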
    • 67. Invention patent application
    • STANDARD FORMAT INTERMEDIATE RESULT
    • US20160004506A1
    • 2016-01-07
    • US14749002
    • 2015-06-24
    • VIA ALLIANCE SEMICONDUCTOR CO, LTD.
    • THOMAS ELMER
    • G06F7/483; G06F9/30; G06F9/38; G06F7/544
    • G06F7/483; G06F7/485; G06F7/4876; G06F7/49957; G06F7/5443; G06F9/3001; G06F9/30014; G06F9/3017; G06F9/30185; G06F9/38; G06F9/3893; G06F17/16
    • A microprocessor comprises an instruction pipeline, a shared memory, and first and second arithmetic processing units in the instruction pipeline, each capable of reading or receiving operands from and writing or providing results to the shared memory. The first arithmetic processing unit performs a first portion of a mathematical operation to produce an intermediate result vector that is not a complete, final result of the mathematical operation. The first arithmetic processing unit generates a plurality of non-architectural calculation control indicators that indicate how subsequent calculations to generate a final result from the intermediate result vector should proceed. The second arithmetic processing unit performs a second portion of the mathematical operation, in accordance with the calculation control indicators, to produce a complete, final result of the mathematical operation.
    • 68. Invention patent application
    • DATA PROCESSING APPARATUS AND METHOD FOR PERFORMING SCAN OPERATIONS
    • US20150212972A1
    • 2015-07-30
    • US14165967
    • 2014-01-28
    • ARM LIMITED
    • Matthias Lothar BOETTCHER; Mbou EYOLE-MONONO; Giacomo GABRIELLI
    • G06F15/78; G06F9/30
    • G06F15/78; G06F9/3001; G06F9/30036; G06F9/30098; G06F9/3017; G06F9/3875; G06F9/3887; G06F9/3893
    • A data processing apparatus and method are provided for executing a vector scan instruction. The data processing apparatus comprises a vector register store configured to store vector operands, and processing circuitry configured to perform operations on vector operands retrieved from said vector register store. Further, control circuitry is configured to control the processing circuitry to perform the operations required by one or more instructions, said one or more instructions including a vector scan instruction specifying a vector operand comprising N vector elements and defining a scan operation to be performed on a sequence of vector elements within the vector operand. The control circuitry is responsive to the vector scan instruction to partition the N vector elements of the specified vector operand into P groups of adjacent vector elements, where P is between 2 and N/2, and to control the processing circuitry to perform a partitioned scan operation yielding the same result as the defined scan operation. The processing circuitry is configured to perform the partitioned scan operation by performing separate scan operations on those vector elements of the sequence contained within each group to produce intermediate results for each group, and to perform a computation operation to combine the intermediate results into a final result vector operand containing a sequence of result vector elements. The partitioned scan operation approach of the present invention enables a balance to be achieved between energy consumption and performance.
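The partitioned scan described in the abstract above (split N elements into P groups, scan each group separately, then combine the intermediate results) can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the circuitry claimed in the application: the function name `partitioned_scan` is hypothetical, the groups are assumed to be equal-sized, and the scan operation is assumed to be associative (addition by default).

```python
from itertools import accumulate
import operator


def partitioned_scan(elements, p, op=operator.add):
    """Inclusive scan computed as P per-group scans plus a combine step.

    Assumes `op` is associative and that len(elements) is divisible by p,
    with 2 <= p <= N/2 as in the abstract.
    """
    n = len(elements)
    if not 2 <= p <= n // 2:
        raise ValueError("p must be between 2 and N/2")
    if n % p != 0:
        raise ValueError("this sketch assumes equal-sized groups")
    size = n // p

    # Step 1: independent scans within each group of adjacent elements.
    groups = [
        list(accumulate(elements[i:i + size], op))
        for i in range(0, n, size)
    ]

    # Step 2: combine intermediate results by carrying the total of all
    # preceding groups into every element of each later group.
    result = list(groups[0])
    carry = groups[0][-1]
    for group in groups[1:]:
        result.extend([op(carry, x) for x in group])
        carry = op(carry, group[-1])
    return result


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7, 8]
    print(partitioned_scan(data, 2))
    print(list(accumulate(data)))  # unpartitioned scan for comparison
```

Because the per-group scans in step 1 do not depend on one another, they could run in parallel; the choice of P then trades the amount of parallel work against the cost of the combine step, which is consistent with the energy and performance balance the abstract mentions.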