    • 82. Invention Application
    • Title: Performing an Allreduce Operation Using Shared Memory
    • Publication No.: US20080301683A1
    • Publication Date: 2008-12-04
    • Application No.: US11754782
    • Filing Date: 2007-05-29
    • Inventors: Charles J. Archer; Gabor Dozsa; Joseph D. Ratterman; Brian E. Smith
    • IPC: G06F9/46
    • CPC: G06F9/4843; G06F9/52; G06F9/546
    • Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
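The abstract above describes a work-stealing style of shared-memory allreduce: one core builds a job status object listing work units, and whichever core is available next claims and executes the next unit. The C sketch below illustrates only that control flow, under stated assumptions: the work-unit granularity (one slice of the reduce buffer per unit), the atomic next-unit counter, and the use of POSIX threads to stand in for processing cores are illustrative choices, not the patented implementation.

```c
/* Minimal sketch of the shared-memory allreduce scheme described above:
 * a job status object lists work units, and any available core claims the
 * next one via an atomic counter.  Slice-per-unit granularity and pthread
 * "cores" are assumptions made for illustration. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define CORES  4
#define UNITS  8          /* shared-memory allreduce work units       */
#define SLICE  1024       /* elements reduced by one work unit        */

static double input[CORES][UNITS * SLICE];   /* per-core contributions  */
static double result[UNITS * SLICE];         /* shared allreduce result */

/* Job status object: which work unit should the next available core take? */
static struct { atomic_int next_unit; } job_status = { 0 };

static void *available_core(void *arg)
{
    (void)arg;
    for (;;) {
        /* determine the next shared-memory allreduce work unit */
        int u = atomic_fetch_add(&job_status.next_unit, 1);
        if (u >= UNITS)
            break;
        /* perform that work unit: sum every core's slice u into result */
        for (int i = u * SLICE; i < (u + 1) * SLICE; i++) {
            double sum = 0.0;
            for (int c = 0; c < CORES; c++)
                sum += input[c][i];
            result[i] = sum;   /* visible to all cores via shared memory */
        }
    }
    return NULL;
}

int main(void)
{
    for (int c = 0; c < CORES; c++)
        for (int i = 0; i < UNITS * SLICE; i++)
            input[c][i] = 1.0;

    pthread_t t[CORES];
    for (int c = 0; c < CORES; c++)
        pthread_create(&t[c], NULL, available_core, NULL);
    for (int c = 0; c < CORES; c++)
        pthread_join(t[c], NULL);

    printf("result[0] = %.1f (expected %d)\n", result[0], CORES);
    return 0;
}
```

Compile with cc -pthread; the atomic fetch-and-add hands each available core a distinct work unit without any further coordination.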
    • 83. Invention Application
    • Title: Low Latency, High Bandwidth Data Communications Between Compute Nodes in a Parallel Computer
    • Publication No.: US20080281997A1
    • Publication Date: 2008-11-13
    • Application No.: US11746333
    • Filing Date: 2007-05-09
    • Inventors: Charles J. Archer; Michael A. Blocksome; Joseph D. Ratterman; Brian E. Smith
    • IPC: G06F13/28
    • CPC: G06F13/4269
    • Abstract: Methods, parallel computers, and computer program products are disclosed for low latency, high bandwidth data communications between compute nodes in a parallel computer. Embodiments include receiving, by an origin direct memory access (‘DMA’) engine of an origin compute node, data for transfer to a target compute node; sending, by the origin DMA engine of the origin compute node to a target DMA engine on the target compute node, a request to send (‘RTS’) message; transferring, by the origin DMA engine, a predetermined portion of the data to the target compute node using a memory FIFO operation; determining, by the origin DMA engine, whether an acknowledgement of the RTS message has been received from the target DMA engine; if an acknowledgement of the RTS message has not been received, transferring, by the origin DMA engine, another predetermined portion of the data to the target compute node using a memory FIFO operation; and if the acknowledgement of the RTS message has been received by the origin DMA engine, transferring, by the origin DMA engine, any remaining portion of the data to the target compute node using a direct put operation.
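The scheme in this abstract is a hybrid of eager and rendezvous transfer: after sending the RTS, the origin DMA engine keeps streaming fixed-size portions through memory FIFO packets, and once the target acknowledges the RTS it switches to a single direct put for the remainder. The sketch below mimics only that decision logic; the chunk size, the polled acknowledgement check, and the two helper functions standing in for the DMA engine's FIFO and direct-put mechanisms are assumptions made for illustration.

```c
/* Control-flow sketch of the eager/rendezvous hybrid described above. */
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define CHUNK 512   /* predetermined portion transferred per FIFO operation */

/* stand-ins for the origin DMA engine's two transfer mechanisms */
static void memory_fifo_send(const char *data, size_t off, size_t len)
{
    printf("FIFO packet: bytes %zu..%zu\n", off, off + len - 1);
    (void)data;
}
static void direct_put(const char *data, size_t off, size_t len)
{
    printf("direct put:  bytes %zu..%zu\n", off, off + len - 1);
    (void)data;
}

/* polled by the origin; true once the target DMA engine has acknowledged
 * the RTS (here simply after a fixed number of polls, an assumption) */
static bool rts_ack_received(int poll_count) { return poll_count >= 3; }

static void origin_dma_transfer(const char *data, size_t total)
{
    size_t sent = 0;
    int polls = 0;

    printf("send RTS to target DMA engine\n");

    /* until the RTS ack arrives, keep pushing predetermined portions */
    while (sent < total && !rts_ack_received(polls++)) {
        size_t len = (total - sent < CHUNK) ? total - sent : CHUNK;
        memory_fifo_send(data, sent, len);
        sent += len;
    }
    /* ack received (or data exhausted): direct put any remaining portion */
    if (sent < total)
        direct_put(data, sent, total - sent);
}

int main(void)
{
    static char payload[4096];
    origin_dma_transfer(payload, sizeof payload);
    return 0;
}
```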
    • 87. Invention Grant
    • Title: Send-side matching of data communications messages
    • Publication No.: US08776081B2
    • Publication Date: 2014-07-08
    • Application No.: US12881863
    • Filing Date: 2010-09-14
    • Inventors: Charles J. Archer; Michael A. Blocksome; Joseph D. Ratterman; Brian E. Smith
    • IPC: G06F13/00; G06F15/16; G06F15/173; G06F9/54; G06F9/46
    • CPC: G06F9/546; G06F9/46; G06F9/52; G06F15/16; G06F15/17312
    • Abstract: Send-side matching of data communications messages includes a plurality of compute nodes organized for collective operations, including: issuing by a receiving node to source nodes a receive message that specifies receipt of a single message to be sent from any source node, the receive message including message matching information, a specification of a hardware-level mutual exclusion device, and an identification of a receive buffer; matching by two or more of the source nodes the receive message with pending send messages in the two or more source nodes; operating by one of the source nodes having a matching send message the mutual exclusion device, excluding messages from other source nodes with matching send messages and identifying to the receiving node the source node operating the mutual exclusion device; and sending to the receiving node from the source node operating the mutual exclusion device a matched pending message.
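Send-side matching inverts the usual receive-side matching: the receiver advertises one wildcard receive, every source with a matching pending send races on a mutual exclusion device, and only the winner delivers its message. A minimal sketch of that race follows, with pthread_mutex_trylock standing in for the hardware-level mutual exclusion device and a simple integer tag standing in for the message matching information; both are assumptions, not the patented mechanism.

```c
/* Sketch of send-side matching: one published receive, many candidate
 * senders, and a mutual-exclusion device that picks exactly one. */
#include <pthread.h>
#include <stdio.h>

#define SOURCES     4
#define WANTED_TAG  7   /* message matching information in the receive */

static pthread_mutex_t exclusion_device = PTHREAD_MUTEX_INITIALIZER;
static int winner = -1;          /* source node that operated the device */
static int receive_buffer = 0;   /* buffer identified in the receive     */

struct source { int rank; int pending_tag; int payload; };

static void *source_node(void *arg)
{
    struct source *s = arg;

    /* match the broadcast receive against this node's pending send */
    if (s->pending_tag != WANTED_TAG)
        return NULL;

    /* operate the mutual exclusion device; losing senders drop out */
    if (pthread_mutex_trylock(&exclusion_device) == 0) {
        winner = s->rank;              /* identify self to the receiver    */
        receive_buffer = s->payload;   /* send the matched pending message */
        /* device stays held so no other matching source can also send */
    }
    return NULL;
}

int main(void)
{
    struct source nodes[SOURCES] = {
        {0, 3, 100}, {1, 7, 111}, {2, 7, 222}, {3, 5, 300}
    };
    pthread_t t[SOURCES];

    for (int i = 0; i < SOURCES; i++)
        pthread_create(&t[i], NULL, source_node, &nodes[i]);
    for (int i = 0; i < SOURCES; i++)
        pthread_join(t[i], NULL);

    printf("receive matched by source %d, value %d\n", winner, receive_buffer);
    return 0;
}
```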
    • 89. Invention Grant
    • Title: Performing an allreduce operation using shared memory
    • Publication No.: US08752051B2
    • Publication Date: 2014-06-10
    • Application No.: US13427057
    • Filing Date: 2012-03-22
    • Inventors: Charles J. Archer; Gabor Dozsa; Joseph D. Ratterman; Brian E. Smith
    • IPC: G06F9/46; G06F9/48; G06F9/52
    • CPC: G06F9/4843; G06F9/52; G06F9/546
    • Abstract: Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
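This grant shares its abstract with application US20080301683A1 above. As a complement to the worker-loop sketch given there, the snippet below shows one hypothetical way the core that receives the allreduce instruction could establish the job status object itself, i.e. the ordered list of work units that available cores later claim. The descriptor fields and the reduce-then-copy-out decomposition are illustrative assumptions, not the patented layout.

```c
/* Illustrative job status object: the core that received the allreduce
 * instruction fills in one descriptor per shared-memory work unit. */
#include <stdio.h>

#define CORES   4
#define SLICES  8

enum unit_kind { UNIT_REDUCE, UNIT_COPY_OUT };

struct work_unit {
    enum unit_kind kind;   /* what this unit does                    */
    int slice;             /* which slice of the buffer it covers    */
    int target_core;       /* for COPY_OUT: whose buffer receives it */
};

struct job_status {
    int count;                                  /* units specified          */
    int next;                                   /* claimed by available cores */
    struct work_unit units[SLICES * (1 + CORES)];
};

/* Executed once by the core that received the allreduce instruction. */
static void establish_job_status(struct job_status *js)
{
    js->count = 0;
    js->next = 0;
    for (int s = 0; s < SLICES; s++)            /* reduce each slice once */
        js->units[js->count++] = (struct work_unit){ UNIT_REDUCE, s, -1 };
    for (int s = 0; s < SLICES; s++)            /* then fan the result out */
        for (int c = 0; c < CORES; c++)
            js->units[js->count++] = (struct work_unit){ UNIT_COPY_OUT, s, c };
}

int main(void)
{
    static struct job_status js;
    establish_job_status(&js);
    printf("job status object lists %d work units\n", js.count);
    return 0;
}
```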
    • 90. Invention Application
    • Title: Optimizing Collective Communications Within A Parallel Computer
    • Publication No.: US20140047451A1
    • Publication Date: 2014-02-13
    • Application No.: US13569614
    • Filing Date: 2012-08-08
    • Inventors: Charles J. Archer; Michael A. Blocksome; Joseph D. Ratterman; Brian E. Smith
    • IPC: G06F9/46
    • CPC: G06F9/5061; G06F2209/505
    • Abstract: Methods, apparatuses, and computer program products are provided for optimizing collective communications within a parallel computer comprising a plurality of hardware threads for executing software threads of a parallel application. Embodiments include a processor of a parallel computer determining, for each software thread, an affinity of the software thread to a particular hardware thread. Each affinity indicates an assignment of a software thread to a particular hardware thread. The processor also generates one or more affinity domains based on the affinities of the software threads. Embodiments also include a processor generating, for each affinity domain, a topology of the affinity domain based on the affinities of the software threads to the hardware threads. According to embodiments of the present application, a processor also performs, based on the generated topologies of the affinity domains, a collective operation on one or more software threads.
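The abstract describes grouping software threads into affinity domains according to which hardware threads they are pinned to, deriving a topology per domain, and then running collectives over that structure. The sketch below shows the idea at its simplest, under assumptions not taken from the patent: one affinity domain per physical core, two hardware threads per core, and a sum reduction staged as intra-domain followed by inter-domain combining.

```c
/* Sketch of affinity-domain staging for a collective sum reduction. */
#include <stdio.h>

#define SW_THREADS   8
#define HW_PER_CORE  2   /* hardware threads per core -> per affinity domain */

/* affinity: software thread i is assigned to hardware thread affinity[i] */
static const int affinity[SW_THREADS] = { 0, 1, 2, 3, 4, 5, 6, 7 };

/* generate affinity domains: here, one domain per physical core */
static int domain_of(int sw_thread)
{
    return affinity[sw_thread] / HW_PER_CORE;
}

int main(void)
{
    int contribution[SW_THREADS];
    int domain_sum[SW_THREADS / HW_PER_CORE] = { 0 };
    int total = 0;

    for (int i = 0; i < SW_THREADS; i++)
        contribution[i] = i + 1;

    /* stage 1: collective within each affinity domain (shared cache, cheap) */
    for (int i = 0; i < SW_THREADS; i++)
        domain_sum[domain_of(i)] += contribution[i];

    /* stage 2: collective across the domain topology (one value per domain) */
    for (int d = 0; d < SW_THREADS / HW_PER_CORE; d++) {
        printf("affinity domain %d partial sum = %d\n", d, domain_sum[d]);
        total += domain_sum[d];
    }
    printf("collective result = %d\n", total);
    return 0;
}
```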