Quick Patent Search - Search global patents quickly, free commercial patent database - IPRDB

1. Granted invention patent

US06859871B1 Method and apparatus for reducing power consumption in a pipelined processor (in force)
Publication No.: US06859871B1
Publication date: 2005-02-22
Application No.: US09174936
Filing date: 1998-10-19
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo, Kent E. Wires
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo, Kent E. Wires
IPC classification: G06F1/32, G06F9/38, G06F15/00
CPC classification: G06F9/3836, G06F1/32, G06F9/30072, G06F9/3838, G06F9/3857
Abstract: The invention provides techniques for reducing the power consumption of pipelined processors. In an illustrative embodiment, the invention evaluates the predicates of predicated instructions in a decode stage of a pipelined processor, and annuls instructions with false predicates before those instructions can be processed by subsequent stages, e.g., by execute and writeback stages. The predicate dependencies can be handled using, e.g., a virtual single-cycle execution technique which locks a predicate register while the register is in use by a given instruction, and then stalls subsequent instructions that depend on a value stored in the register until the register is unlocked. As another example, the predicate dependencies can be handled using a compiler-controlled dynamic dispatch (CCDD) technique, which identifies dependencies associated with a set of instructions during compilation of the instructions in a compiler. One or more instructions are then grouped in a code block which includes a field indicating the dependencies associated with those instructions, and the instructions are then, e.g., either stalled or decoded serially, based on the dependencies present in the code block. By eliminating unnecessary processing for false-predicate instructions, the invention significantly reduces the power consumption of the processor.
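
The decode-stage annulment described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Instr` class, the `decode_stage` function, and the list-of-booleans predicate file are hypothetical names chosen for illustration, and predicate dependencies (locking or CCDD) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    op: str
    # Index of the guarding predicate register; None means unpredicated.
    # (Hypothetical encoding, not taken from the patent.)
    pred: Optional[int] = None


def decode_stage(program, predicates):
    """Split a program into instructions issued to later stages and
    instructions annulled in decode because their predicate is false."""
    issued, annulled = [], []
    for ins in program:
        if ins.pred is not None and not predicates[ins.pred]:
            # False predicate: drop here so execute/writeback never see it.
            annulled.append(ins)
        else:
            issued.append(ins)
    return issued, annulled


program = [Instr("add"), Instr("mul", pred=0), Instr("sub", pred=1), Instr("load", pred=0)]
predicates = [False, True]  # assumed current contents of the predicate file
issued, annulled = decode_stage(program, predicates)
print([i.op for i in issued])    # ['add', 'sub']
print([i.op for i in annulled])  # ['mul', 'load']
```

In this toy run, two of the four instructions never reach the later stages, which is the source of the power saving the abstract claims.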

2. Granted invention patent

US06282585B1 Cooperative interconnection for reducing port pressure in clustered microprocessors (in force)
Publication No.: US06282585B1
Publication date: 2001-08-28
Application No.: US09274134
Filing date: 1999-03-22
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Kent E. Wires
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Kent E. Wires
IPC classification: G06F13/14
CPC classification: G06F9/3885, G06F9/30032, G06F9/30072, G06F9/3012, G06F9/3013, G06F9/30141, G06F9/3824, G06F9/3891, G06F15/8007
Abstract: The invention provides techniques for reducing the port pressure of a clustered processor. In an illustrative embodiment, the processor includes multiple clusters of execution units, with each of the clusters having a portion of a register file and a portion of a predicate file associated therewith, such that a given cluster is permitted to write to and read from its associated portions of the register and predicate files. A cooperative interconnection technique in accordance with the invention utilizes an inter-cluster move instruction specifying a source cluster and a destination cluster to copy a value from the source cluster to the destination cluster. The value is transmitted over a designated interconnect structure within the processor, and the inter-cluster move instruction is separated into two sub-instructions, one of which is executed by a unit in the source cluster, and another of which is executed by a unit in the destination cluster. These units may be, e.g., augmented ALUs or dedicated interface units within the clusters.
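
The split of an inter-cluster move into a source-side and a destination-side sub-instruction can be sketched as follows. This is an illustrative model only: `Cluster`, `Interconnect`, `send_half`, and `receive_half` are hypothetical names, and the interconnect is modeled as a simple FIFO, which the abstract does not specify.

```python
from collections import deque


class Cluster:
    def __init__(self, name, num_regs=4):
        self.name = name
        # This cluster's private portion of the register file.
        self.regs = [0] * num_regs


class Interconnect:
    """Hypothetical FIFO standing in for the designated interconnect structure."""

    def __init__(self):
        self.queue = deque()


def send_half(src, src_reg, link):
    # Sub-instruction 1: runs in the source cluster and reads only local registers.
    link.queue.append(src.regs[src_reg])


def receive_half(dst, dst_reg, link):
    # Sub-instruction 2: runs in the destination cluster and writes only local registers.
    dst.regs[dst_reg] = link.queue.popleft()


def inter_cluster_move(src, src_reg, dst, dst_reg, link):
    # The single architectural move is carried out as two cooperating halves.
    send_half(src, src_reg, link)
    receive_half(dst, dst_reg, link)


a, b = Cluster("A"), Cluster("B")
a.regs[2] = 99
link = Interconnect()
inter_cluster_move(a, 2, b, 0, link)
print(b.regs)  # [99, 0, 0, 0]
```

Because each half touches only its own cluster's registers, neither register-file portion needs an extra port for remote access, which is the port-pressure reduction the abstract describes.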

3. Granted invention patent

US06256725B1 Shared datapath processor utilizing stack-based and register-based storage spaces (in force)
Publication No.: US06256725B1
Publication date: 2001-07-03
Application No.: US09205466
Filing date: 1998-12-04
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo, Kent E. Wires
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo, Kent E. Wires
IPC classification: G06F12/02
CPC classification: G06F9/30134, G06F9/30185
Abstract: A processor is configured to include at least two architecturally-distinct storage spaces, such as, for example, a stack for storing control operands associated with one or more instructions, and a register file for storing computational operands associated with one or more instructions. The processor further includes a datapath which is at least partially shared by the stack and register file, a multiplexer operative to select an output of either the stack or the register file for application to an input of the shared datapath, and a demultiplexer operative to select an output of the shared datapath for application to an input of either the stack or the register file. A program executed by the processor selects one of the storage spaces using, for example, a tag bit associated with a given instruction and indicating which of the storage spaces is to be used with that instruction, or a branch machine view (bmv) instruction which generates a control signal operative to select the given one of the storage spaces.
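
A rough software analogy for the tag-bit selection described above is sketched here. The class name, the `STACK`/`REGFILE` tag encoding, and the use of a single add operation as the "shared datapath" are all assumptions made for illustration; the patent abstract does not define them.

```python
STACK, REGFILE = 0, 1  # hypothetical tag-bit encoding


class SharedDatapathProcessor:
    def __init__(self):
        self.stack = []          # stack-based storage space
        self.regs = [0] * 4      # register-based storage space

    def execute_add(self, tag, src_a=None, src_b=None, dst=None):
        # Multiplexer: the tag bit picks which storage space feeds the datapath.
        if tag == STACK:
            b = self.stack.pop()
            a = self.stack.pop()
        else:
            a, b = self.regs[src_a], self.regs[src_b]

        result = a + b  # the one adder shared by both storage spaces

        # Demultiplexer: the same tag bit routes the result back.
        if tag == STACK:
            self.stack.append(result)
        else:
            self.regs[dst] = result
        return result


p = SharedDatapathProcessor()
p.stack = [2, 3]
p.regs = [10, 20, 0, 0]
p.execute_add(STACK)             # pops 3 and 2, pushes 5
p.execute_add(REGFILE, 0, 1, 2)  # regs[2] = regs[0] + regs[1]
print(p.stack, p.regs)           # [5] [10, 20, 30, 0]
```

The point of the sketch is that only the operand selection and result routing differ between the two modes; the arithmetic in the middle is the same code path.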

4. Granted invention patent

US06230251B1 File replication methods and apparatus for reducing port pressure in a clustered processor (in force)
Publication No.: US06230251B1
Publication date: 2001-05-08
Application No.: US09274133
Filing date: 1999-03-22
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Kent E. Wires
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Kent E. Wires
IPC classification: G06F13/14
CPC classification: G06F9/3891, G06F9/30032, G06F9/30072, G06F9/3012, G06F9/3013, G06F9/30141, G06F9/3824, G06F9/3828, G06F9/3885
Abstract: The invention provides techniques for reducing the port pressure of a clustered processor. In an illustrative embodiment, the processor includes multiple clusters of execution units, with each of the clusters having a portion of a register file and a portion of a predicate file associated therewith, such that a given cluster is permitted to write to and read from its associated portions of the register and predicate files. A replication technique in accordance with the invention reduces port pressure by replicating, e.g., a register lock file and a predicate lock file of the processor for each of the clusters. The replicated files vary depending upon whether the technique is implemented with a write-only interconnection or a read-only interconnection.

5. Granted invention patent

US06269437B1 Duplicator interconnection methods and apparatus for reducing port pressure in a clustered processor (in force)
Publication No.: US06269437B1
Publication date: 2001-07-31
Application No.: US09274129
Filing date: 1999-03-22
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Kent E. Wires
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Kent E. Wires
IPC classification: G06F15/76
CPC classification: G06F9/3891, G06F9/30072, G06F9/3012, G06F9/30141, G06F9/3824, G06F9/3828, G06F9/3885
Abstract: The invention provides techniques for reducing the port pressure of a clustered processor. In an illustrative embodiment, the processor includes multiple clusters of execution units, with each of the clusters having a portion of a register file and a portion of a predicate file associated therewith, such that a given cluster is permitted to write to and read from its associated portions of the register and predicate files. A duplicator interconnection technique in accordance with the invention reduces port pressure by providing one or more global move units in the processor. A given global move unit uses an inter-cluster move instruction to copy a value from a portion of the register or predicate file associated with a source cluster to another portion of the register or predicate file associated with a destination cluster.

6. Granted invention patent

US06260189B1 Compiler-controlled dynamic instruction dispatch in pipelined processors (lapsed)
Publication No.: US06260189B1
Publication date: 2001-07-10
Application No.: US09152744
Filing date: 1998-09-14
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo, Stamatis Vassiliadis, Kent E. Wires
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo, Stamatis Vassiliadis, Kent E. Wires
IPC classification: G06F9/44
CPC classification: G06F8/4451
Abstract: The invention provides techniques for improving the performance of pipelined processors by eliminating unnecessary stalling of instructions. In an illustrative embodiment, a compiler is used to identify pipeline dependencies in a given set of instructions. The compiler then groups the set of instructions into a code block having a field which indicates the types of pipeline dependencies, if any, in the set of instructions. The field may indicate the types of pipeline dependencies by specifying which of a predetermined set of hazards arise in the plurality of instructions when executed on a given pipelined processor. For example, the field may indicate whether the code block includes any Read After Write (RAW) hazards, Write After Write (WAW) hazards or Write After Read (WAR) hazards. The code block may include one or more dynamic scheduling instructions, with each of the dynamic scheduling instructions including a set of instructions for execution in a multi-issue processor.
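
The compile-time hazard field described above can be sketched as a small pass over a block of instructions. The bit-flag encoding (`RAW`, `WAW`, `WAR`), the `Instr` layout, and the `dispatch_mode` policy are assumptions for illustration; the check is also deliberately conservative, comparing every ordered pair in the block without tracking intervening writes.

```python
from dataclasses import dataclass

RAW, WAW, WAR = 1, 2, 4  # hypothetical bit flags for the dependency field


@dataclass(frozen=True)
class Instr:
    dst: str
    srcs: tuple


def hazard_field(block):
    """Compute the dependency field a compiler might attach to a code block."""
    field = 0
    for i, earlier in enumerate(block):
        for later in block[i + 1:]:
            if earlier.dst in later.srcs:
                field |= RAW  # later reads what earlier writes
            if earlier.dst == later.dst:
                field |= WAW  # both write the same register
            if later.dst in earlier.srcs:
                field |= WAR  # later overwrites what earlier reads
    return field


def dispatch_mode(field):
    # Assumed policy: hazard-free blocks may issue together, others go serially.
    return "parallel" if field == 0 else "serial"


dependent = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r1", "r5")),  # RAW on r1
    Instr("r2", ("r6", "r7")),  # WAR on r2 against the first instruction
]
independent = [Instr("r1", ("r2", "r3")), Instr("r4", ("r5", "r6"))]

print(hazard_field(dependent), dispatch_mode(hazard_field(dependent)))      # 5 serial
print(hazard_field(independent), dispatch_mode(hazard_field(independent)))  # 0 parallel
```

Since the field is computed once at compile time, the hardware only has to inspect a few bits per block rather than rediscover the dependencies on every execution.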

7. Granted invention patent

US06317821B1 Virtual single-cycle execution in pipelined processors (lapsed)
Publication No.: US06317821B1
Publication date: 2001-11-13
Application No.: US09080787
Filing date: 1998-05-18
Applicants: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo
Inventors: Dean Batten, Paul Gerard D'Arcy, C. John Glossner, Sanjay Jinturkar, Jesse Thilo
IPC classification: G06F9/30
CPC classification: G06F9/3838, G06F9/3836, G06F9/384, G06F9/3855, G06F9/3857
Abstract: A pipelined processor is configured to provide virtual single-cycle instruction execution using a register locking mechanism in conjunction with instruction stalling based on lock status. In an illustrative embodiment, a set of register locks is maintained in the form of a stored bit vector in which each bit indicates the current lock status of a corresponding register. A decode unit receives an instruction fetched from memory, and decodes the instruction to determine its source and destination registers. The instruction is stalled for at least one processor cycle if either its source register or destination register is already locked by another instruction. The stall continues until the source and destination registers of the instruction are both unlocked, i.e., no longer in use by other instructions. Before the instruction is dispatched for execution, the destination register of the instruction is again locked, and remains locked until after the instruction completes execution and writes its result to the destination register. The decode unit can thus dispatch instructions to execution units of the processor as if the execution of each of the instructions completed in a single processor cycle, in effect ignoring the individual latencies of the execution units. Moreover, the instructions can be dispatched for execution in a program-specified order, but permitted to complete execution in a different order.
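
The lock bit vector and the stall-or-dispatch decision in this abstract can be sketched as below. `LockVector`, `try_dispatch`, and `writeback` are hypothetical names; the sketch models only the locking rule and does not simulate cycles or execution-unit latencies.

```python
class LockVector:
    """Register locks kept as a bit vector: bit i set means register i is locked."""

    def __init__(self):
        self.bits = 0

    def is_locked(self, reg):
        return bool((self.bits >> reg) & 1)

    def lock(self, reg):
        self.bits |= 1 << reg

    def unlock(self, reg):
        self.bits &= ~(1 << reg)


def try_dispatch(locks, srcs, dst):
    """Return True (and lock dst) if the instruction can dispatch this cycle;
    return False if it must stall because a source or destination is locked."""
    if any(locks.is_locked(r) for r in (*srcs, dst)):
        return False
    locks.lock(dst)  # stays locked until the result is written back
    return True


def writeback(locks, dst):
    # The result has been written to the destination register; release the lock.
    locks.unlock(dst)


locks = LockVector()
first = try_dispatch(locks, (1, 2), 3)    # r3 <- r1 op r2: dispatches, locks r3
stalled = try_dispatch(locks, (3, 4), 5)  # needs r3, which is still locked
writeback(locks, 3)                       # first instruction completes
retried = try_dispatch(locks, (3, 4), 5)  # now dispatches, locks r5
print(first, stalled, retried, bin(locks.bits))  # True False True 0b100000
```

The decode logic never needs to know how long an execution unit takes; it only consults the lock bits, which is what lets instructions dispatch in program order yet complete out of order.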
