Quick Patent Search - Search global patents quickly, free commercial-use patent database - IPRDB

1. Granted invention patent

US07149920B2 Deterministic error recovery protocol (Expired)
Publication No.: US07149920B2
Publication date: 2006-12-12
Application No.: US10674952
Filing date: 2003-09-30
Applicant: Matthew A. Blumrich, Dong Chen, Alan G. Gara, Philip Heidelberger, Dirk I. Hoenicke, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Inventor: Matthew A. Blumrich, Dong Chen, Alan G. Gara, Philip Heidelberger, Dirk I. Hoenicke, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
IPC classification: G06F11/00, G06F11/07
CPC classification: G06F11/1443, G06F11/0709, G06F11/0793, H04L1/0052, H04L69/28, H04L69/40, H04L2001/0092
Abstract: Disclosed are an error recovery method and system for use with a communication system having first and second nodes, each of said nodes having a receiver and a sender, the sender of the first node being connected to the receiver of the second node by a first cable, and the sender of the second node being connected to the receiver of the first node by a second cable. The method comprises the step of both nodes entering the same defined state after one of the nodes detects an error. In particular, the receiver of the first node enters an error state, stays in the error state for a defined period of time T, and, after said defined period of time T, enters a wait state. Also, the sender of the first node sends an error message to the receiver of the second node for a defined period of time Te, and after the defined period of time Te, the sender of the first node enters an idle state.
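The timed transitions described in the abstract can be sketched as a small state machine. This is a minimal illustration only, not the patented implementation: the class and state names, the `NORMAL` starting state, and the tick-based clock are assumptions, and the reaction of the second node to the error message is not modeled.

```python
from enum import Enum


class RxState(Enum):
    NORMAL = "normal"  # assumed starting state, not named in the abstract
    ERROR = "error"
    WAIT = "wait"


class TxState(Enum):
    NORMAL = "normal"  # assumed starting state, not named in the abstract
    SENDING_ERROR = "sending_error"
    IDLE = "idle"


class Node:
    """Hypothetical model of the node that detects the error."""

    def __init__(self, t_error, t_send):
        self.t_error = t_error  # period T spent in the receiver error state
        self.t_send = t_send    # period Te spent sending the error message
        self.rx = RxState.NORMAL
        self.tx = TxState.NORMAL
        self.error_time = None

    def detect_error(self, now):
        # Receiver enters the error state; sender starts sending error messages.
        self.error_time = now
        self.rx = RxState.ERROR
        self.tx = TxState.SENDING_ERROR

    def tick(self, now):
        # Advance both state machines based on time elapsed since detection.
        if self.error_time is None:
            return
        elapsed = now - self.error_time
        if self.rx is RxState.ERROR and elapsed >= self.t_error:
            self.rx = RxState.WAIT
        if self.tx is TxState.SENDING_ERROR and elapsed >= self.t_send:
            self.tx = TxState.IDLE
```

Because both transitions depend only on fixed timers, the state after an error is predictable regardless of what traffic was in flight, which is the sense of "deterministic" suggested by the title.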

2. Granted invention patent

US07650434B2 Global tree network for computing structures enabling global processing operations (Expired)
Publication No.: US07650434B2
Publication date: 2010-01-19
Application No.: US10469000
Filing date: 2002-02-25
Applicant: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Inventor: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
IPC classification: G06F15/16
CPC classification: G06F15/17337
Abstract: A system and method for enabling high-speed, low-latency global tree network communications among processing nodes interconnected according to a tree network structure. The global tree network enables collective reduction operations to be performed during parallel algorithm operations executing in a computer structure having a plurality of the interconnected processing nodes. Router devices are included that interconnect the nodes of the tree via links to facilitate performance of low-latency global processing operations at nodes of the virtual tree and sub-tree structures. The global operations performed include one or more of: broadcast operations downstream from a root node to leaf nodes of a virtual tree, reduction operations upstream from leaf nodes to the root node in the virtual tree, and point-to-point message passing from any node to the root node. The global tree network is configurable to provide global barrier and interrupt functionality in an asynchronous or synchronized manner, and is physically and logically partitionable.
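The two collective operations named in the abstract, an upstream reduction and a downstream broadcast, can be sketched in software as follows. This is only an illustration under assumptions: the tree is represented as a hypothetical dictionary mapping each node to its children, and the recursion stands in for what the patent describes as router hardware.

```python
def reduce_up(tree, values, node, op):
    """Combine the values of `node` and all its descendants, leaves first."""
    result = values[node]
    for child in tree.get(node, []):
        result = op(result, reduce_up(tree, values, child, op))
    return result


def broadcast_down(tree, node, message, received=None):
    """Deliver `message` from `node` to every node beneath it."""
    if received is None:
        received = {}
    received[node] = message
    for child in tree.get(node, []):
        broadcast_down(tree, child, message, received)
    return received
```

Chaining the two, a reduction to the root followed by a broadcast of the result, gives every node the global value, which is the usual shape of an all-reduce on a tree.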

3. Granted invention patent

US07587516B2 Class network routing (Expired)
Publication No.: US07587516B2
Publication date: 2009-09-08
Application No.: US10468999
Filing date: 2002-02-25
Applicant: Gyan Bhanot, Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Inventor: Gyan Bhanot, Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
IPC classification: G06F15/173, H04L12/66, H04L12/50
CPC classification: H04L45/16, H04L45/06
Abstract: Class network routing is implemented in a network such as a computer network comprising a plurality of parallel compute processors at nodes thereof. Class network routing allows a compute processor to broadcast a message to a range (one or more) of other compute processors in the computer network, such as processors in a column or a row. Normally this type of operation requires a separate message to be sent to each processor. With class network routing pursuant to the invention, a single message is sufficient, which generally reduces the total number of messages in the network as well as the latency of a broadcast. Class network routing is also applied to dense matrix inversion algorithms on distributed memory parallel supercomputers with hardware class function (multicast) capability. This is achieved by exploiting the fact that the communication patterns of dense matrix inversion can be served by hardware class functions, which results in faster execution times.
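The message-count saving claimed in the abstract can be illustrated with a toy model of one row of processors. Everything here is an assumption made for illustration: the function names are hypothetical, the row is treated as a ring with wrap-around, and the "keep a copy and forward" behavior is a simplified stand-in for the hardware class function.

```python
def unicast_broadcast(row, source):
    """Without class routing: one injected message per destination."""
    return [(source, dst) for dst in row if dst != source]


def class_route_broadcast(row, source):
    """With class routing: a single injected message travels along the row,
    and each node it passes keeps a copy before forwarding it."""
    start = row.index(source)
    delivered = []
    for hop in range(1, len(row)):
        delivered.append(row[(start + hop) % len(row)])
    injected_messages = 1
    return injected_messages, delivered
```

For a row of four nodes, the unicast version injects three messages while the class-routed version injects one and still reaches the same three destinations.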

4. Granted invention patent

US07174434B2 Low latency memory access and synchronization (Expired)
Publication No.: US07174434B2
Publication date: 2007-02-06
Application No.: US10468994
Filing date: 2002-02-25
Applicant: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Inventor: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
IPC classification: G06F12/12
CPC classification: G06F9/52
Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device, rather than the processor, performs the subsequent write operation. A simple prefetching scheme for non-contiguous data structures is also disclosed. A memory line is redefined so that, in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers, rather than some other predictive algorithm, are used to determine which memory line to prefetch. This enables hardware to effectively prefetch memory access patterns that are non-contiguous but repetitive.
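The single-load lock acquisition described in the abstract can be modeled in software as a device whose read has a side effect. This is a sketch under assumptions, not the hardware design: the class name, the boolean return convention, and the `release` method are hypothetical, and real hardware would make the read-and-set step atomic across processors.

```python
class LockDevice:
    """Toy model of a hardware locking device."""

    def __init__(self, num_locks):
        self._held = [False] * num_locks

    def load(self, lock_id):
        """The only operation the processor issues to acquire a lock.

        Returns True if the caller now owns the lock. The device itself
        performs the follow-up write that marks the lock as held.
        """
        was_held = self._held[lock_id]
        self._held[lock_id] = True  # write done by the device, not the processor
        return not was_held

    def release(self, lock_id):
        # Assumed release path; the abstract does not describe it.
        self._held[lock_id] = False
```

The point of the design is that the processor never issues a read-modify-write sequence: one read either grants the lock or reports that it is taken.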

5. Granted invention patent

US07529895B2 Method for prefetching non-contiguous data structures (Expired)
Publication No.: US07529895B2
Publication date: 2009-05-05
Application No.: US11617276
Filing date: 2006-12-28
Applicant: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Inventor: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Martin Ohmacht, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
IPC classification: G06F13/28
CPC classification: G06F12/0862, G06F9/52, G06F2212/6028
Abstract: A low latency memory system access is provided in association with a weakly-ordered multiprocessor system. Each processor in the multiprocessor shares resources, and each shared resource has an associated lock within a locking device that provides support for synchronization between the multiple processors in the multiprocessor and the orderly sharing of the resources. A processor only has permission to access a resource when it owns the lock associated with that resource, and an attempt by a processor to own a lock requires only a single load operation, rather than a traditional atomic load followed by store, such that the processor only performs a read operation and the hardware locking device, rather than the processor, performs the subsequent write operation. A simple prefetching scheme for non-contiguous data structures is also disclosed. A memory line is redefined so that, in addition to the normal physical memory data, every line includes a pointer that is large enough to point to any other line in the memory, wherein the pointers, rather than some other predictive algorithm, are used to determine which memory line to prefetch. This enables hardware to effectively prefetch memory access patterns that are non-contiguous but repetitive.
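The pointer-per-line prefetching idea can be sketched with a small simulation. The abstract states only that each line carries a pointer used to choose the prefetch target; the update rule used here (point each line at whichever line was accessed next) and all names are assumptions made for illustration.

```python
class PointerPrefetcher:
    """Toy model: each memory line stores a pointer to the line to prefetch."""

    def __init__(self):
        self.next_line = {}      # line number -> pointer stored with that line
        self.prefetched = set()  # lines currently sitting in the prefetch buffer
        self.last = None         # previously accessed line

    def access(self, line):
        """Access a line; return True if it had already been prefetched."""
        hit = line in self.prefetched
        if self.last is not None:
            # Assumed update rule: the previous line now points at this one.
            self.next_line[self.last] = line
        # Follow this line's pointer, if it has one, to pick the next prefetch.
        self.prefetched = set()
        target = self.next_line.get(line)
        if target is not None:
            self.prefetched.add(target)
        self.last = line
        return hit
```

Running a non-contiguous pattern such as lines 7, 42, 3, 19 twice shows the intended effect: the first pass only records pointers, and on the second pass the lines after the first one are found already prefetched.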

6. Granted invention patent

US07305487B2 Optimized scalable network switch (Expired)
Publication No.: US07305487B2
Publication date: 2007-12-04
Application No.: US10469001
Filing date: 2002-02-25
Applicant: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
Inventor: Matthias A. Blumrich, Dong Chen, Paul W. Coteus, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Todd E. Takken, Pavlos M. Vranas
IPC classification: G06F15/173
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: In a massively parallel computing system having a plurality of nodes configured in m multi-dimensions, each node including a computing device, a method for routing packets towards their destination nodes is provided which includes generating at least one of a 2m plurality of compact bit vectors containing information derived from downstream nodes. A multilevel arbitration process uses downstream information stored in the compact vectors, such as link status information and fullness of downstream buffers, to determine a preferred direction and virtual channel for packet transmission. Preferred direction ranges are encoded and virtual channels are selected by examining the plurality of compact bit vectors. This dynamic routing method eliminates the necessity of routing tables, thus enhancing scalability of the switch.
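The use of compact bit vectors to pick an output direction can be sketched with integer bit masks. This is a heavily simplified, single-level illustration: the abstract describes a multilevel arbitration that also selects a virtual channel, and the bit layout, the function name, and the lowest-bit tie-break used here are all assumptions.

```python
def choose_direction(preferred, link_up, buffer_free):
    """Pick an output direction from bit vectors with one bit per direction.

    preferred:   directions that move the packet toward its destination
    link_up:     directions whose link is working
    buffer_free: directions whose downstream buffer has room
    Returns the index of the chosen direction, or None if none is usable.
    """
    usable = preferred & link_up & buffer_free
    if usable == 0:
        return None
    # Assumed tie-break: take the lowest-numbered usable direction.
    return (usable & -usable).bit_length() - 1
```

With m = 3 dimensions there are 2m = 6 directions, so each vector fits in six bits and the decision needs only a few bitwise operations and no routing table lookup.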

7. Granted invention patent

US07486619B2 Multidimensional switch network (Expired)
Publication No.: US07486619B2
Publication date: 2009-02-03
Application No.: US10793068
Filing date: 2004-03-04
Applicant: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas, Matthias Augustin Blumrich
Inventor: Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Dirk Hoenicke, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas, Matthias Augustin Blumrich
IPC classification: H04L12/28
CPC classification: H04L49/1576, H04L45/06
Abstract: Multidimensional switch data networks are disclosed, such as are used by a distributed-memory parallel computer, as applied for example to computations in the field of life sciences. A distributed memory parallel computing system comprises a number of parallel compute nodes and a message passing data network connecting the compute nodes together. The data network connecting the compute nodes comprises a multidimensional switch data network of compute nodes having N dimensions, and a number/array of compute nodes Ln in each of the N dimensions. Each compute node includes an N port routing element having a port for each of the N dimensions. Each compute node of an array of Ln compute nodes in each of the N dimensions connects through a port of its routing element to an Ln port crossbar switch having Ln ports. Several embodiments of a 4 dimensional computing system having 65,536 compute nodes are disclosed.
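The node and switch counts implied by this topology follow from simple arithmetic, sketched below. The 16 x 16 x 16 x 16 shape in the usage note is only one assumed way to reach 65,536 nodes (the abstract does not give the dimension sizes), and the function name is hypothetical.

```python
from math import prod


def network_size(dims):
    """Return (compute nodes, crossbar switches) for dimension sizes `dims`.

    The node count is the product of the sizes Ln. In dimension n, every
    group of Ln nodes that differ only in that coordinate shares one
    Ln-port crossbar, so that dimension needs nodes // Ln switches.
    """
    nodes = prod(dims)
    switches = sum(nodes // ln for ln in dims)
    return nodes, switches
```

For example, `network_size([16, 16, 16, 16])` gives 65,536 nodes and 16,384 crossbar switches, that is, 4,096 sixteen-port switches in each of the four dimensions.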

8. Granted invention patent

US07315877B2 Efficient implementation of a multidimensional fast Fourier transform on a distributed-memory parallel multi-node computer (In force)
Publication No.: US07315877B2
Publication date: 2008-01-01
Application No.: US10468998
Filing date: 2002-02-25
Applicant: Gyan V. Bhanot, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Inventor: Gyan V. Bhanot, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
IPC classification: G06F17/14
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network, thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
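The sequence of steps in the abstract (one-dimensional transform, randomized all-to-all redistribution, second one-dimensional transform) can be sketched on a single machine for a square two-dimensional array. Several simplifications are assumed: each row stands in for one node, a plain O(n^2) discrete Fourier transform replaces a true FFT, the shuffle only randomizes the order of simulated sends, and the final redistribution back to the original layout is a choice made here rather than something the abstract specifies.

```python
import cmath
import random


def dft(x):
    """Plain O(n^2) discrete Fourier transform, standing in for a 1-D FFT."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def all_to_all_transpose(rows, rng):
    """Simulated redistribution: 'node' r sends element c to 'node' c.

    The sends are issued in random order; the result is the transpose.
    """
    n = len(rows)
    out = [[None] * n for _ in range(n)]
    sends = [(r, c) for r in range(n) for c in range(n)]
    rng.shuffle(sends)
    for r, c in sends:
        out[c][r] = rows[r][c]
    return out


def fft2d(a, seed=0):
    """2-D transform of a square array via two rounds of 1-D transforms."""
    rng = random.Random(seed)
    first = [dft(row) for row in a]                # transform along dimension 1
    redistributed = all_to_all_transpose(first, rng)
    second = [dft(row) for row in redistributed]   # transform along dimension 2
    return all_to_all_transpose(second, rng)       # back to the original layout
```

The result does not depend on the send order; on a real network the randomization matters only for spreading traffic so that links are not all contended at the same moment.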

9. Granted invention patent

US07313582B2 Arithmetic functions in torus and tree networks (Expired)
Publication No.: US07313582B2
Publication date: 2007-12-25
Application No.: US10468991
Filing date: 2002-02-25
Applicant: Gyan Bhanot, Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Inventor: Gyan Bhanot, Matthias A. Blumrich, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
IPC classification: G06F7/38
CPC classification: G06F15/17337
Abstract: Methods and systems for performing arithmetic functions. In accordance with a first aspect of the invention, methods and apparatus are provided that combine software algorithms with a hardware implementation of class network routing to achieve a very significant reduction in the time required for global arithmetic operations on the torus. This leads to greater scalability of applications running on large parallel machines. The invention involves three steps in improving the efficiency and accuracy of global operations: (1) Ensuring, when necessary, that all the nodes do the global operation on the data in the same order and so obtain a unique answer, independent of roundoff error; (2) Using the topology of the torus to minimize the number of hops and the bidirectional capabilities of the network to reduce the number of time steps in the data transfer operation to an absolute minimum; and (3) Using class function routing to reduce latency in the data transfer. With the method of this invention, every single element is injected into the network only once and it will be stored and forwarded without any further software overhead. In accordance with a second aspect of the invention, methods and systems are provided to efficiently implement global arithmetic operations on a network that supports the global combining operations. The latency of doing such global operations is greatly reduced by using these methods.
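Step (1) of the abstract, having every node combine the data in the same order, matters because floating-point addition is not associative. The sketch below illustrates that point only; the function names and the choice of ascending rank as the fixed order are assumptions.

```python
def sum_in_order(contributions, order):
    """Add the contributions in the given order of node ranks."""
    total = 0.0
    for rank in order:
        total += contributions[rank]
    return total


def ordered_global_sum(contributions):
    """Every node adds the values in ascending rank order, so all nodes
    compute bit-identical results regardless of message arrival order."""
    return sum_in_order(contributions, sorted(contributions))
```

With contributions `{0: 1e16, 1: 1.0, 2: -1e16}`, adding in the order 0, 1, 2 gives 0.0 while the order 0, 2, 1 gives 1.0, so two nodes that summed in arrival order could disagree. Fixing the order removes that disagreement, although it does not by itself make the result more accurate.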

10. Granted invention patent

US08095585B2 Efficient implementation of multidimensional fast Fourier transform on a distributed-memory parallel multi-node computer (Expired)
Publication No.: US08095585B2
Publication date: 2012-01-10
Application No.: US11931898
Filing date: 2007-10-31
Applicant: Gyan V. Bhanot, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
Inventor: Gyan V. Bhanot, Dong Chen, Alan G. Gara, Mark E. Giampapa, Philip Heidelberger, Burkhard D. Steinmacher-Burow, Pavlos M. Vranas
IPC classification: G06F17/14
CPC classification: H05K7/20836, F24F11/77, G06F9/52, G06F9/526, G06F15/17381, G06F17/142, G09G5/008, H04L7/0338
Abstract: The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network, thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
