Quick Patent Search: search global patents quickly, free commercial patent database - IPRDB

61. Granted invention patent

US06338129B1 Manifold array processor (in force)
Publication No.: US06338129B1
Publication date: 2002-01-08
Application No.: US09323609
Filing date: 1999-06-01
Applicants: Gerald G. Pechanek; Charles W. Kurak, Jr.
Inventors: Gerald G. Pechanek; Charles W. Kurak, Jr.
IPC classification: G06F15/16
CPC classification: G06F15/17381, G06F9/30076, G06F15/17337, G06F15/8023
Abstract: An array processor includes processing elements arranged in clusters which are, in turn, combined in a rectangular array. Each cluster is formed of processing elements which preferably communicate with the processing elements of at least two other clusters. Additionally, each inter-cluster communication path is mutually exclusive, that is, each path carries either north and west, south and east, north and east, or south and west communications. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path. That is, communications from a cluster which communicates to the north and east with another cluster may be combined in one path, thus eliminating half the wiring required for the path. Additionally, the length of the longest communication path is not directly determined by the overall dimension of the array, as it is in conventional torus arrays. Rather, the longest communications path is limited only by the inter-cluster spacing. In one implementation, transpose elements of an N×N torus are combined in clusters and communicate with one another through intra-cluster communications paths. Since transpose elements have direct connections to one another, transpose operation latency is eliminated in this approach. Additionally, each PE may have a single transmit port and a single receive port. As a result, the individual PEs are decoupled from the topology of the array.
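
The clustering idea in this abstract can be illustrated with a small sketch. The abstract does not state how PEs are assigned to clusters, so the rule used here, grouping PEs of an N×N torus by (row + column) mod N, is an assumption made for illustration, and all function names are hypothetical. Under that assumption, each PE shares a cluster with its transpose, and its north and west neighbours fall in one adjacent cluster while its south and east neighbours fall in another, which matches two of the direction pairings the abstract lists.

```python
def cluster_of(row, col, n):
    """Cluster index under the ASSUMED grouping rule (row + col) mod n."""
    return (row + col) % n


def torus_neighbors(row, col, n):
    """Nearest neighbours of a PE in an n x n torus, with wraparound."""
    return {
        "north": ((row - 1) % n, col),
        "south": ((row + 1) % n, col),
        "west": (row, (col - 1) % n),
        "east": (row, (col + 1) % n),
    }


def build_clusters(n):
    """Map each cluster index to the list of PE coordinates it contains."""
    clusters = {k: [] for k in range(n)}
    for row in range(n):
        for col in range(n):
            clusters[cluster_of(row, col, n)].append((row, col))
    return clusters


def neighbor_clusters(row, col, n):
    """For one PE, report which cluster each torus direction leads to."""
    return {
        direction: cluster_of(r, c, n)
        for direction, (r, c) in torus_neighbors(row, col, n).items()
    }


if __name__ == "__main__":
    n = 4
    for index, members in build_clusters(n).items():
        print(index, members)
    # North/west share one destination cluster, south/east share another.
    print(neighbor_clusters(1, 2, n))
```

Because two directions always lead to the same destination cluster in this toy model, one shared path per cluster pair would suffice for both, which is the wiring reduction the abstract describes.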

62. Granted invention patent

US06219776B1 Merged array controller and processing element (in force)
Publication No.: US06219776B1
Publication date: 2001-04-17
Application No.: US09169072
Filing date: 1998-10-09
Applicants: Gerald G. Pechanek; Juan G. Revilla
Inventors: Gerald G. Pechanek; Juan G. Revilla
IPC classification: G06F13/00
CPC classification: G06F9/3885, G06F9/3012, G06F9/30145, G06F9/30189, G06F9/3887, G06F15/17343, G06F15/8007
Abstract: A highly parallel data processing system includes an array of n processing elements (PEs) and a controller sequence processor (SP) wherein at least one PE is combined with the controller SP to create a Dynamic Merged Processor (DP) which supports two modes of operation. In its first mode of operation, the DP acts as one of the PEs in the array and participates in the execution of single-instruction-multiple-data (SIMD) instructions. In the second mode of operation, the DP acts as the controlling element for the array of PEs and executes non-array instructions. To support these two modes of operation, the DP includes a plurality of execution units and two general-purpose register files. The execution units are “shared” in that they can execute instructions in either mode of operation. With very long instruction word (VLIW) capability, both modes of operation can be in effect on a cycle-by-cycle basis for every VLIW executed. This structure allows the controlling element in a highly parallel SIMD processor to be reused as one of the processing elements in the array to reduce the overall number of transistors and wires in the SIMD processor while maintaining its capabilities and performance.
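
As a rough illustration of the two-mode idea in this abstract, the toy model below keeps one set of execution logic and two register files, and lets a per-instruction flag choose which file an operation reads and writes. The class name, the `simd` flag, the opcode names and the register-file size are all hypothetical; the abstract does not describe the instruction encoding, so this is a sketch under those assumptions and not the patented design.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicMergedProcessor:
    """Toy DP: shared execution logic, two general-purpose register files."""

    # Register file used when acting as the array controller (assumed size 8).
    sp_regs: list = field(default_factory=lambda: [0] * 8)
    # Register file used when acting as one of the PEs (assumed size 8).
    pe_regs: list = field(default_factory=lambda: [0] * 8)

    def execute(self, op, dst, a, b, simd):
        """Run one operation; `simd` selects the PE or the SP register file."""
        regs = self.pe_regs if simd else self.sp_regs
        if op == "add":
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown op: {op}")
        return regs[dst]


if __name__ == "__main__":
    dp = DynamicMergedProcessor()
    dp.sp_regs[1], dp.sp_regs[2] = 2, 3
    dp.pe_regs[1], dp.pe_regs[2] = 10, 20
    print(dp.execute("add", 0, 1, 2, simd=False))  # controller-mode operation
    print(dp.execute("add", 0, 1, 2, simd=True))   # array-mode operation
```

The same `execute` logic serves both modes, which mirrors the abstract's point that the execution units are shared while the register state stays separate.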

63. Granted invention patent

US06173389B2 Methods and apparatus for dynamic very long instruction word sub-instruction selection for execution time parallelism in an indirect very long instruction word processor (in force)
Publication No.: US06173389B2
Publication date: 2001-01-09
Application No.: US09205588
Filing date: 1998-12-04
Applicants: Gerald G. Pechanek; Juan Guillermo Revilla; Edwin F. Barry
Inventors: Gerald G. Pechanek; Juan Guillermo Revilla; Edwin F. Barry
IPC classification: G06F15/80
CPC classification: G06F9/3842, G06F9/3017, G06F9/3822, G06F9/3853
Abstract: A pipelined data processing unit includes an instruction sequencer and n functional units capable of executing n operations in parallel. The instruction sequencer includes a random access memory for storing very-long-instruction-words (VLIWs) used in operations involving the execution of two or more functional units in parallel. Each VLIW comprises a plurality of short-instruction-words (SIWs) where each SIW corresponds to a unique type of instruction associated with a unique functional unit. VLIWs are composed in the VLIW memory by loading and concatenating SIWs in each address, or entry. VLIWs are executed via the execute-VLIW (XV) instruction. The iVLIWs can be compressed at a VLIW memory address by use of a mask field contained within the XV1 instruction which specifies which functional units are enabled, or disabled, during the execution of the VLIW. The mask can be changed each time the XV1 instruction is executed, effectively modifying the VLIW every time it is executed. The VLIW memory (VIM) can be further partitioned into separate memories each associated with a function decode-and-execute unit. With a second execute VLIW instruction XV2, each functional unit's VIM can be independently addressed, thereby removing duplicate SIWs within the functional unit's VIM. This provides a further optimization of the VLIW storage, thereby allowing the use of smaller VLIW memories in cost-sensitive applications.
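
The masked execute-VLIW mechanism described in this abstract can be sketched as follows. This is a minimal model under stated assumptions: the slot names, the `VliwMemory` class and its methods are hypothetical, SIWs are represented as plain strings, and the mask is modelled as a set of enabled slot names, not as the bit field of a real XV1 encoding.

```python
# Hypothetical functional-unit slot names, one SIW position per unit.
SLOTS = ("store", "load", "alu", "mau", "dsu")


class VliwMemory:
    """Toy VLIW memory (VIM): each entry holds at most one SIW per slot."""

    def __init__(self, entries):
        # None marks an empty slot in an entry.
        self.vim = [dict.fromkeys(SLOTS) for _ in range(entries)]

    def load_siw(self, address, slot, siw):
        """Compose a VLIW by placing one SIW into its slot at `address`."""
        if slot not in SLOTS:
            raise ValueError(f"unknown slot: {slot}")
        self.vim[address][slot] = siw

    def execute_vliw(self, address, enable_mask):
        """Return the SIWs issued: loaded slots that the mask enables."""
        entry = self.vim[address]
        return [
            entry[slot]
            for slot in SLOTS
            if slot in enable_mask and entry[slot] is not None
        ]


if __name__ == "__main__":
    vim = VliwMemory(entries=4)
    vim.load_siw(0, "load", "LD r1")
    vim.load_siw(0, "alu", "ADD r2")
    vim.load_siw(0, "mau", "MPY r3")
    # The same stored VLIW issues different SIW subsets depending on the mask.
    print(vim.execute_vliw(0, {"load", "alu", "mau"}))
    print(vim.execute_vliw(0, {"alu"}))
```

Storing one entry and varying the mask at execution time is what lets several distinct instruction combinations share a single VIM address in this model.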

64. Granted invention patent

US6023753A Manifold array processor (lapsed)
Publication No.: US6023753A
Publication date: 2000-02-08
Application No.: US885310
Filing date: 1997-06-30
Applicants: Gerald G. Pechanek; Charles W. Kurak, Jr.
Inventors: Gerald G. Pechanek; Charles W. Kurak, Jr.
IPC classification: G06F15/173, G06F15/80, G06F15/00
CPC classification: G06F15/17381, G06F15/17337, G06F15/8023, G06F9/30076
Abstract: An array processor includes processing elements arranged in clusters which are, in turn, combined in a rectangular array. Each cluster is formed of processing elements which preferably communicate with the processing elements of at least two other clusters. Additionally, each inter-cluster communication path is mutually exclusive, that is, each path carries either north and west, south and east, north and east, or south and west communications. Due to the mutual exclusivity of the data paths, communications between the processing elements of each cluster may be combined in a single inter-cluster path. That is, communications from a cluster which communicates to the north and east with another cluster may be combined in one path, thus eliminating half the wiring required for the path. Additionally, the length of the longest communication path is not directly determined by the overall dimension of the array, as it is in conventional torus arrays. Rather, the longest communications path is limited only by the inter-cluster spacing. In one implementation, transpose elements of an N×N torus are combined in clusters and communicate with one another through intra-cluster communications paths. Since transpose elements have direct connections to one another, transpose operation latency is eliminated in this approach. Additionally, each PE may have a single transmit port and a single receive port. As a result, the individual PEs are decoupled from the topology of the array.

65. Invention patent application

US20130007331A1 System Core for Transferring Data Between an External Device and Memory (lapsed)
Publication No.: US20130007331A1
Publication date: 2013-01-03
Application No.: US13611969
Filing date: 2012-09-12
Applicants: Gerald G. Pechanek; David Carl Strube; Edwin Frank Barry; Charles W. Kurak, Jr.; Carl Donald Busboom; Dale Edward Schneider; Nikos P. Pitsianis; Grayson Morris; Edward A. Wolff; Patrick R. Marchand; Ricardo E. Rodriguez; Marco C. Jacobs
Inventors: Gerald G. Pechanek; David Carl Strube; Edwin Frank Barry; Charles W. Kurak, Jr.; Carl Donald Busboom; Dale Edward Schneider; Nikos P. Pitsianis; Grayson Morris; Edward A. Wolff; Patrick R. Marchand; Ricardo E. Rodriguez; Marco C. Jacobs
IPC classification: G06F13/36
CPC classification: G06F15/82, G06F9/30145, G06F11/263, Y10S707/99943
Abstract: Details of a highly cost-effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, and the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.

66. Invention patent application

US20110225224A1 Efficient Complex Multiplication and Fast Fourier Transform (FFT) Implementation on the ManArray Architecture (in force)
Publication No.: US20110225224A1
Publication date: 2011-09-15
Application No.: US13116332
Filing date: 2011-05-26
Applicants: Nikos P. Pitsianis; Gerald G. Pechanek; Ricardo E. Rodriguez
Inventors: Nikos P. Pitsianis; Gerald G. Pechanek; Ricardo E. Rodriguez
IPC classification: G06F17/14, G06F7/499
CPC classification: G06F15/82, G06F9/30014, G06F9/30032, G06F9/30036, G06F9/3853, G06F9/3885, G06F15/8023, G06F15/8038, G06F17/142
Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used, allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.

67. Invention patent application

US20110219210A1 System Core for Transferring Data Between an External Device and Memory (lapsed)
Publication No.: US20110219210A1
Publication date: 2011-09-08
Application No.: US13106042
Filing date: 2011-05-12
Applicants: Gerald G. Pechanek; David Carl Strube; Edwin Frank Barry; Charles W. Kurak, Jr.; Carl Donald Busboom; Dale Edward Schneider; Nikos P. Pitsianis; Grayson Morris; Edward A. Wolff; Patrick R. Marchand; Ricardo E. Rodriguez; Marco C. Jacobs
Inventors: Gerald G. Pechanek; David Carl Strube; Edwin Frank Barry; Charles W. Kurak, Jr.; Carl Donald Busboom; Dale Edward Schneider; Nikos P. Pitsianis; Grayson Morris; Edward A. Wolff; Patrick R. Marchand; Ricardo E. Rodriguez; Marco C. Jacobs
IPC classification: G06F9/312, G06F9/30
CPC classification: G06F15/82, G06F9/30145, G06F11/263, Y10S707/99943
Abstract: Details of a highly cost-effective and efficient implementation of a manifold array (ManArray) architecture and instruction syntax for use therewith are described herein. Various aspects of this approach include the regularity of the syntax, the relative ease with which the instruction set can be represented in database form, the ready ability with which tools can be created, and the ready generation of self-checking codes and parameterized test cases. Parameterizations can be fairly easily mapped and system maintenance is significantly simplified.

68. Invention patent application

US20080301414A1 Efficient Complex Multiplication and Fast Fourier Transform (FFT) Implementation on the ManArray Architecture (in force)
Publication No.: US20080301414A1
Publication date: 2008-12-04
Application No.: US12187746
Filing date: 2008-08-07
Applicants: Nikos P. Pitsianis; Gerald G. Pechanek; Ricardo E. Rodriguez
Inventors: Nikos P. Pitsianis; Gerald G. Pechanek; Ricardo E. Rodriguez
IPC classification: G06F9/302
CPC classification: G06F15/82, G06F9/30014, G06F9/30032, G06F9/30036, G06F9/3853, G06F9/3885, G06F15/8023, G06F15/8038, G06F17/142
Abstract: Efficient computation of complex multiplication results and very efficient fast Fourier transforms (FFTs) are provided. A parallel array VLIW digital signal processor is employed along with specialized complex multiplication instructions and communication operations between the processing elements which are overlapped with computation to provide very high performance operation. Successive iterations of a loop of tightly packed VLIWs are used, allowing the complex multiplication pipeline hardware to be efficiently used. In addition, efficient techniques for supporting combined multiply accumulate operations are described.

69. Granted invention patent

US07072929B2 Methods and apparatus for efficient complex long multiplication and covariance matrix implementation (lapsed)
Publication No.: US07072929B2
Publication date: 2006-07-04
Application No.: US10004010
Filing date: 2001-11-01
Applicants: Gerald G. Pechanek; Ricardo Rodriguez; Matthew Plonski; David Strube; Kevin Coopman
Inventors: Gerald G. Pechanek; Ricardo Rodriguez; Matthew Plonski; David Strube; Kevin Coopman
IPC classification: G06F7/52
CPC classification: G06F7/4812, G06F9/30014, G06F9/325, G06F9/3885, G06F17/15, G06F17/16, G06F2207/3896
Abstract: A digital signal processor for computing various types of complex multiplication is described. The digital signal processor operates in conjunction with registers, a multiplier, an adder, and a multiplexer. The registers store first and second complex operands. The multiplier simultaneously performs multiplications to produce each combination of products between the real and imaginary terms of the first and second complex operands. The multiplexer selects which produced products are added to or subtracted from each other based on the type of complex multiplication being performed. The adder simultaneously performs additions and subtractions, if necessary, to produce both real and imaginary results depending on whether the type of complex multiplication being performed is a conjugated operation. The registers store the results of the complex multiplication.
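
The multiply, select, add/subtract flow in this abstract can be sketched in a few lines. This is a minimal software sketch: the function name and the `conjugate` flag are hypothetical, operands are modelled as (real, imaginary) tuples, and the conjugated variant is assumed to conjugate the second operand, which the abstract does not specify.

```python
def complex_multiply(x, y, conjugate=False):
    """Multiply x by y, or by the conjugate of y when `conjugate` is set.

    x and y are (real, imaginary) tuples. All four cross products are
    formed first; the flag then selects which ones are added or subtracted.
    """
    a, b = x
    c, d = y
    # The four products between real and imaginary terms (the multiplier step).
    ac, bd, ad, bc = a * c, b * d, a * d, b * c
    if conjugate:
        # (a + bi)(c - di) = (ac + bd) + (bc - ad)i
        return (ac + bd, bc - ad)
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i
    return (ac - bd, ad + bc)


if __name__ == "__main__":
    x, y = (3, 4), (1, -2)
    print(complex_multiply(x, y))
    print(complex_multiply(x, y, conjugate=True))
```

Both variants reuse the same four products and differ only in the sign selection, which corresponds to the role the abstract gives the multiplexer.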

70. Granted invention patent

US06874078B2 Merged control/process element processor for executing VLIW simplex instructions with SISD control/SIMD process mode bit (lapsed)
Publication No.: US06874078B2
Publication date: 2005-03-29
Application No.: US10620144
Filing date: 2003-07-15
Applicants: Gerald G. Pechanek; Juan G. Revilla
Inventors: Gerald G. Pechanek; Juan G. Revilla
IPC classification: G06F15/16, G06F9/30, G06F9/318, G06F15/173, G06F15/80, G06F9/40
CPC classification: G06F9/3885, G06F9/3012, G06F9/30145, G06F9/30189, G06F9/3887, G06F15/17343, G06F15/8007
Abstract: A highly parallel data processing system includes an array of n processing elements (PEs) and a controller sequence processor (SP) wherein at least one PE is combined with the controller SP to create a Dynamic Merged Processor (DP) which supports two modes of operation. In its first mode of operation, the DP acts as one of the PEs in the array and participates in the execution of single-instruction-multiple-data (SIMD) instructions. In the second mode of operation, the DP acts as the controlling element for the array of PEs and executes non-array instructions. To support these two modes of operation, the DP includes a plurality of execution units and two general-purpose register files. The execution units are “shared” in that they can execute instructions in either mode of operation. With very long instruction word (VLIW) capability, both modes of operation can be in effect on a cycle-by-cycle basis for every VLIW executed. This structure allows the controlling element in a highly parallel SIMD processor to be reused as one of the processing elements in the array to reduce the overall number of transistors and wires in the SIMD processor while maintaining its capabilities and performance.
