    • 43. Invention application
    • Performing an Allreduce Operation Using Shared Memory
    • US20080301683A1
    • 2008-12-04
    • US11754782
    • 2007-05-29
    • Charles J. Archer, Gabor Dozsa, Joseph D. Ratterman, Brian E. Smith
    • G06F9/46
    • G06F9/4843, G06F9/52, G06F9/546
    • Methods, apparatus, and products are disclosed for performing an allreduce operation using shared memory that include: receiving, by at least one of a plurality of processing cores on a compute node, an instruction to perform an allreduce operation; establishing, by the core that received the instruction, a job status object for specifying a plurality of shared memory allreduce work units, the plurality of shared memory allreduce work units together performing the allreduce operation on the compute node; determining, by an available core on the compute node, a next shared memory allreduce work unit in the job status object; and performing, by that available core on the compute node, that next shared memory allreduce work unit.
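The work-unit scheme in the abstract above can be illustrated with a minimal sketch. It assumes that threads stand in for processing cores, that the reduction is an element-wise sum, and that a work unit is a contiguous slice of the element range; the names `JobStatus`, `core_loop`, and `shared_memory_allreduce` are hypothetical and do not come from the patent.

```python
import threading


class JobStatus:
    """Hypothetical job status object: lists the work units and hands out the next one."""

    def __init__(self, buffers, chunk):
        self.buffers = buffers  # one contribution buffer per core, all the same length
        n = len(buffers[0])
        # Assumption: each work unit is a (start, end) slice of the element range.
        self.units = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
        self.result = [0] * n  # shared result, visible to every core
        self._next = 0
        self._lock = threading.Lock()

    def next_unit(self):
        """Return the next unclaimed work unit, or None when all are claimed."""
        with self._lock:
            if self._next >= len(self.units):
                return None
            unit = self.units[self._next]
            self._next += 1
            return unit


def core_loop(job):
    """Run by each available core: claim and perform work units until none remain."""
    while True:
        unit = job.next_unit()
        if unit is None:
            return
        start, end = unit
        for i in range(start, end):
            # Assumed reduction operator: element-wise sum over all contributions.
            job.result[i] = sum(buf[i] for buf in job.buffers)


def shared_memory_allreduce(buffers, chunk=2, cores=4):
    """Establish the job status object, then let every core drain it."""
    job = JobStatus(buffers, chunk)
    threads = [threading.Thread(target=core_loop, args=(job,)) for _ in range(cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return job.result
```

For example, `shared_memory_allreduce([[1, 2, 3, 4, 5], [10, 20, 30, 40, 50]])` returns `[11, 22, 33, 44, 55]`. Because the result lives in shared memory, every core sees the reduced values without a separate broadcast step, which is what makes this an allreduce rather than a plain reduce.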
    • 44. Invention application
    • Low Latency, High Bandwidth Data Communications Between Compute Nodes in a Parallel Computer
    • US20080281997A1
    • 2008-11-13
    • US11746333
    • 2007-05-09
    • Charles J. Archer, Michael A. Blocksome, Joseph D. Ratterman, Brian E. Smith
    • G06F13/28
    • G06F13/4269
    • Methods, parallel computers, and computer program products are disclosed for low latency, high bandwidth data communications between compute nodes in a parallel computer. Embodiments include receiving, by an origin direct memory access (‘DMA’) engine of an origin compute node, data for transfer to a target compute node; sending, by the origin DMA engine of the origin compute node to a target DMA engine on the target compute node, a request to send (‘RTS’) message; transferring, by the origin DMA engine, a predetermined portion of the data to the target compute node using a memory FIFO operation; determining, by the origin DMA engine, whether an acknowledgement of the RTS message has been received from the target DMA engine; if the acknowledgement of the RTS message has not been received, transferring, by the origin DMA engine, another predetermined portion of the data to the target compute node using a memory FIFO operation; and if the acknowledgement of the RTS message has been received by the origin DMA engine, transferring, by the origin DMA engine, any remaining portion of the data to the target compute node using a direct put operation.
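The control flow described in the abstract above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function `transfer` and its parameter `ack_after_polls` are hypothetical, with `ack_after_polls` standing in for network latency (the number of acknowledgement checks that fail before the RTS acknowledgement is seen). No real DMA hardware or API is modeled.

```python
def transfer(data, chunk_size, ack_after_polls):
    """Return a log of (operation, payload) pairs for one simulated transfer.

    ack_after_polls is a hypothetical stand-in for latency: the number of
    acknowledgement checks that fail before the RTS acknowledgement arrives.
    """
    log = [("rts", b"")]  # request-to-send message goes out first
    offset = 0

    # Send a first predetermined portion eagerly through the memory FIFO.
    log.append(("fifo", data[offset:offset + chunk_size]))
    offset += chunk_size

    polls = 0
    while offset < len(data):
        if polls >= ack_after_polls:
            # Acknowledgement received: send everything that is left in one direct put.
            log.append(("direct_put", data[offset:]))
            offset = len(data)
        else:
            # No acknowledgement yet: keep the pipe busy with another FIFO portion.
            polls += 1
            log.append(("fifo", data[offset:offset + chunk_size]))
            offset += chunk_size
    return log
```

With `transfer(b"abcdefghij", 2, 2)` the log contains one `rts`, three `fifo` portions (`ab`, `cd`, `ef`), and a final `direct_put` carrying `ghij`. The point of the design is that the origin never idles while waiting for the acknowledgement, yet switches to the cheaper bulk operation as soon as the target is ready.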
    • 46. Invention application
    • Identifying Messaging Completion on a Parallel Computer
    • US20080195840A1
    • 2008-08-14
    • US11672989
    • 2007-02-09
    • Charles J. Archer, Camesha R. Hardwick, Patrick J. McCarthy, Brian P. Wallenfelt
    • G06F15/76, G06F9/06
    • G06F15/17337
    • Methods, parallel computers, and products are provided for identifying messaging completion on a parallel computer. The parallel computer includes a plurality of compute nodes, the compute nodes coupled for data communications by at least two independent data communications networks, including a binary tree data communications network optimal for collective operations that organizes the nodes as a tree and a torus data communications network optimal for point-to-point operations that organizes the nodes as a torus. Embodiments include reading all counters at each node of the torus data communications network; calculating at each node a current node value in dependence upon the values read from the counters at each node; and determining for all nodes whether the current node value for each node is the same as a previously calculated node value for each node. If the current node value is the same as the previously calculated node value for all nodes of the torus data communications network, embodiments include determining that messaging is complete; if the current node value is not the same as the previously calculated node value for all nodes of the torus data communications network, embodiments include determining that messaging is currently incomplete.
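The completion check in the abstract above amounts to comparing two successive snapshots of per-node values. The sketch below assumes the node value is simply the sum of that node's counters, which the abstract does not specify; `node_value` and `messaging_complete` are hypothetical names. On a real machine the "for all nodes" decision would presumably be made with a collective operation over the tree network, and the list comparison here only stands in for that step.

```python
def node_value(counters):
    """Assumed combination rule: the node value is the sum of the node's counters."""
    return sum(counters)


def messaging_complete(all_counters, previous_values):
    """Compare the current per-node values with the previous snapshot.

    all_counters: one list of counter readings per node.
    previous_values: list returned by the previous call, or None on the first call.
    Returns (complete, current_values).
    """
    current = [node_value(c) for c in all_counters]
    # Messaging counts as complete only if no node's value changed since last time.
    complete = previous_values is not None and current == previous_values
    return complete, current
```

A caller would poll repeatedly, feeding each returned snapshot into the next call. The first call always reports incomplete because there is nothing to compare against, and a later call reports complete only when every node's value is unchanged.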
    • 47. Invention application
    • Parallel Execution of Operations for a Partitioned Binary Radix Tree on a Parallel Computer
    • US20080126739A1
    • 2008-05-29
    • US11531846
    • 2006-09-14
    • Charles J. Archer, Benjamin E. Lynam, Gary R. Ricard
    • G06F12/00
    • G06F17/30327, G06F17/30445, Y10S707/99937
    • Methods, apparatus, and products are disclosed for parallel execution of operations for a partitioned binary radix tree (‘PBRT’) that include: receiving, in a parallel computer, an operational entry for the PBRT, the PBRT comprising a plurality of logical pages that contain a plurality of entries, each logical page included in a tier and containing one or more subentries corresponding to the tier of the logical page containing the subentry, each entry being composed of a subentry from each logical page on an entry path; processing in parallel, on the parallel computer, each logical page in each tier, including: identifying a portion of the operational entry that corresponds to the tier of the logical page, and performing an operation on the logical page in dependence upon the identified portion of the operational entry for the tier; and selecting operation results from the logical pages on the entry path for the operational entry.
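One way to picture the per-tier processing described in the abstract above is a lookup in which every logical page is probed at once and the results along the entry path are selected afterwards. The sketch below is a loose illustration, not the patented data structure. It assumes that keys are bit strings split into fixed-width portions, that each page is a dictionary from a subentry to the next page id (or to the stored value in the last tier), that page 0 is the root, and that a thread pool stands in for the parallel computer. The name `lookup` and the layout of `tiers` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup(tiers, key, width):
    """Look up `key` in a tiered page structure.

    tiers[t] maps page id -> {subentry: next page id, or the value in the last tier}.
    Assumes len(key) == width * len(tiers) and that page 0 is the root.
    """
    # Identify the portion of the operational entry that belongs to each tier.
    portions = [key[i:i + width] for i in range(0, len(key), width)]

    # One task per logical page in every tier.
    tasks = [(t, pid, page) for t, pages in enumerate(tiers) for pid, page in pages.items()]

    def probe(task):
        t, pid, page = task
        # Operate on the page using only the portion for that page's tier.
        return (t, pid), page.get(portions[t])

    with ThreadPoolExecutor() as pool:
        results = dict(pool.map(probe, tasks))

    # Select the results that lie on the entry path, starting at the root page.
    pid = 0
    for t in range(len(tiers)):
        hit = results.get((t, pid))
        if hit is None:
            return None  # the key is not present
        pid = hit
    return pid
```

With `tiers = [{0: {"00": 1, "01": 2}}, {1: {"11": "apple"}, 2: {"00": "pear"}}]`, the call `lookup(tiers, "0011", 2)` returns `"apple"`. Most probes are wasted work, since only pages on the entry path contribute to the answer; the trade is extra work for a shorter critical path, because no page has to wait for its parent's result.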
    • 48. Invention application
    • Identifying Failure in a Tree Network of a Parallel Computer
    • US20080072101A1
    • 2008-03-20
    • US11531787
    • 2006-09-14
    • Charles J. Archer, Kurt W. Pinnow, Brian P. Wallenfelt
    • G06F11/00
    • G06F11/3409, G06F11/2236, G06F11/3485, G06F2201/81
    • Methods, parallel computers, and products are provided for identifying failure in a tree network of a parallel computer. The parallel computer includes one or more processing sets including an I/O node and a plurality of compute nodes. For each processing set, embodiments include selecting a set of test compute nodes, the test compute nodes being a subset of the compute nodes of the processing set; measuring the performance of the I/O node of the processing set; measuring the performance of the selected set of test compute nodes; calculating a current test value in dependence upon the measured performance of the I/O node of the processing set, the measured performance of the set of test compute nodes, and a predetermined value for I/O node performance; and comparing the current test value with a predetermined tree performance threshold. If the current test value is below the predetermined tree performance threshold, embodiments include selecting another set of test compute nodes. If the current test value is not below the predetermined tree performance threshold, embodiments include selecting from the test compute nodes one or more potential problem nodes and individually testing the potential problem nodes and the links to those nodes.
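The decision step in the abstract above can be sketched as follows. The abstract does not give the formula for the current test value, so the one used here is purely an assumption: performance is measured as elapsed time, each compute node's time is normalized by how much slower the I/O node ran than its predetermined value, and the test value is the worst normalized time. The function name `check_test_set` and the returned action labels are hypothetical.

```python
def check_test_set(io_time, node_times, expected_io_time, threshold):
    """Decide what to do with one set of test compute nodes.

    io_time: measured time for the I/O node of the processing set.
    node_times: dict of test compute node name -> measured time.
    expected_io_time: predetermined value for I/O node performance.
    threshold: predetermined tree performance threshold.
    Returns (action, suspects).
    """
    # Assumed formula: discount slowness already explained by the I/O node.
    io_factor = io_time / expected_io_time
    normalized = {name: t / io_factor for name, t in node_times.items()}
    test_value = max(normalized.values())

    if test_value < threshold:
        # Below the threshold: nothing suspicious here, move on to another set.
        return "select_another_set", []

    # Not below the threshold: flag the nodes responsible for individual testing.
    suspects = sorted(name for name, v in normalized.items() if v >= threshold)
    return "test_individually", suspects
```

For instance, `check_test_set(4.0, {"n0": 2.0, "n1": 9.0}, 2.0, 3.0)` returns `("test_individually", ["n1"])`: the I/O node ran at half its expected speed, so node times are halved before comparison, and only `n1` still exceeds the threshold.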