    • 1. Granted invention patent
    • Redundancy for on-chip interconnect
    • US08689159B1
    • 2014-04-01
    • US13612629
    • 2012-09-12
    • Robert Palmer; John W. Poulton; Thomas Hastings Greer, III; William James Dally
    • G06F17/50
    • G06F17/5031
    • One embodiment sets forth a technique for satisfying timing requirements of on-chip source-synchronous, CMOS-repeater-based interconnect. Each channel of the on-chip interconnect may include one or more redundant wires. Calibration logic is configured to apply transition patterns to the wires comprising each channel, and the calibration patterns generated in response to the transition patterns are captured. Based on the calibration patterns, the wires that best satisfy the timing requirements of the on-chip interconnect are selected to transmit data. The calibration logic also trims the delays of the clock and selected data wires based on captured calibration patterns to improve the timing margin of the on-chip interconnect. Improving the timing margin of the on-chip interconnect improves chip yields.
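The wire-selection step in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: it assumes each wire's captured calibration patterns have already been reduced to a single timing-margin score (the abstract does not say how), and the names `select_wires`, `margins_ps`, and `needed` are hypothetical.

```python
def select_wires(margins, needed):
    """Return the indices of the `needed` wires with the largest timing margin.

    `margins` is a hypothetical per-wire score derived from calibration;
    how such a score is computed is not described in the abstract.
    """
    if needed > len(margins):
        raise ValueError("channel has fewer wires than required")
    # Rank wire indices from best margin to worst, keep the best `needed`.
    ranked = sorted(range(len(margins)), key=lambda i: margins[i], reverse=True)
    return sorted(ranked[:needed])


# Hypothetical channel: 4 data wires required, 5 physical wires (1 redundant).
margins_ps = [12.0, 3.5, 9.0, 11.0, 10.5]
selected = select_wires(margins_ps, needed=4)
print(selected)  # wire 1, which has the smallest margin, is left unused
```

With one redundant wire per channel, a single slow wire no longer makes the whole channel fail timing, which is how redundancy would help yield under these assumptions.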
    • 2. Granted invention patent
    • Timing calibration for on-chip interconnect
    • US08941430B2
    • 2015-01-27
    • US13612614
    • 2012-09-12
    • Robert Palmer; John W. Poulton; Thomas Hastings Greer, III; William James Dally
    • H03H11/26
    • H03K5/131; H01L2924/0002; H03K5/133; H01L2924/00
    • One embodiment sets forth a timing calibration technique for on-chip source-synchronous, complementary metal-oxide-semiconductor (CMOS) repeater-based interconnect. Two transition patterns may be applied to calibrate the delay of an on-chip data or clock wire. Calibration logic is configured to apply the transition patterns and then trim the delays of the clock and data wires based on captured calibration patterns. The trimming adjusts the delay of the clock and data wires using a configurable delay circuit. Timing errors may be caused by crosstalk, power-supply-induced jitter (PSIJ), or wire delay variation due to transistor and wire metallization mismatch. Chip yields may be improved by reducing the occurrence of timing errors due to mismatched delays between different wires of an on-chip interconnect.
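One way to picture the trimming step described above is a sweep over the settings of the configurable delay circuit. The sketch below assumes, purely for illustration, that each trim code has been marked pass or fail by comparing the captured calibration pattern with the expected one, and that the code in the middle of the widest passing window is chosen. Window centering is a common calibration heuristic and is not stated in the abstract; `pick_trim_code` and `sweep` are hypothetical names.

```python
def pick_trim_code(passes):
    """Return the middle index of the longest run of True values, or None.

    `passes[i]` is True when trim code i captured the expected pattern
    (a hypothetical pass/fail reduction of the calibration patterns).
    """
    best_start, best_len = None, 0
    start = None
    # A trailing False sentinel closes a run that reaches the last code.
    for i, ok in enumerate(list(passes) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2


# Hypothetical sweep of 8 trim codes; codes 2..6 capture the right pattern.
sweep = [False, False, True, True, True, True, True, False]
code = pick_trim_code(sweep)
print(code)
```

Choosing the center of the window leaves roughly equal margin on both sides, which matters when crosstalk or supply-induced jitter shifts the edges after calibration.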
    • 6. Granted invention patent
    • Unified streaming multiprocessor memory
    • US09069664B2
    • 2015-06-30
    • US13240366
    • 2011-09-22
    • William James Dally
    • G06F13/00; G06F12/06; G06F13/16
    • G06F12/06; G06F13/16; G06F13/1605; G06F2213/0038
    • One embodiment of the present invention sets forth a technique for providing a unified memory for access by execution threads in a processing system. Several logically separate memories are combined into a single unified memory that includes a single set of shared memory banks, an allocation of space in each bank across the logical memories, a mapping rule that maps the address space of each logical memory to its partition of the shared physical memory, circuitry including switches and multiplexers that supports the mapping, and an arbitration scheme that allocates access to the banks.
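The mapping rule mentioned in the abstract above can be illustrated with a small model. Everything concrete here is an assumption made for the sketch: the bank count, the three logical memory names, the rows allotted to each, and the word-interleaved rule (`bank = address mod banks`) are hypothetical and are not taken from the patent. Arbitration between simultaneous accesses is left out.

```python
NUM_BANKS = 4  # hypothetical number of shared banks

# Hypothetical allocation: rows reserved in every bank for each logical memory.
PARTITION_ROWS = {"registers": 8, "scratch": 4, "cache": 4}


def build_bases(partition_rows):
    """Stack the partitions one after another and return each one's first row."""
    bases, next_row = {}, 0
    for name, rows in partition_rows.items():
        bases[name] = next_row
        next_row += rows
    return bases


BASE_ROW = build_bases(PARTITION_ROWS)


def map_address(memory, word_addr):
    """Map a word address in a logical memory to a (bank, row) pair."""
    rows = PARTITION_ROWS[memory]
    if word_addr < 0 or word_addr // NUM_BANKS >= rows:
        raise IndexError(f"address {word_addr} is outside the {memory} partition")
    bank = word_addr % NUM_BANKS        # consecutive words go to consecutive banks
    row = BASE_ROW[memory] + word_addr // NUM_BANKS
    return bank, row


print(map_address("registers", 0))
print(map_address("scratch", 5))
print(map_address("cache", 15))
```

Because every logical memory owns a slice of every bank, changing the split between them in this model only means editing `PARTITION_ROWS`; the physical banks stay the same.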
    • 7. Granted invention patent
    • Hierarchical memory addressing
    • US08982140B2
    • 2015-03-17
    • US13241745
    • 2011-09-23
    • William James Dally
    • G06F13/28; G06F15/16; G06F12/02; G06F12/08
    • G06F12/0284; G06F12/08; G06F12/0811; G06F2212/251; G06F2212/2515; G06F2212/253; G06F2212/302; G06F2213/0038
    • One embodiment of the present invention sets forth a technique for addressing data in a hierarchical graphics processing unit cluster. A hierarchical address is constructed based on the location of a storage circuit where a target unit of data resides. The hierarchical address comprises a level field indicating a hierarchical level for the unit of data and a node identifier that indicates which GPU within the GPU cluster currently stores the unit of data. The hierarchical address may further comprise one or more identifiers that indicate which storage circuit in a particular hierarchical level currently stores the unit of data. The hierarchical address is constructed and interpreted based on the level field. The technique advantageously enables programs executing within the GPU cluster to efficiently access data residing in other GPUs using the hierarchical address.
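The idea of an address whose layout depends on its level field can be sketched as a packed integer. The field widths, the 64-bit total, and the two example levels below are hypothetical choices for illustration only; the abstract does not give a bit layout. Level 0 is assumed to need no storage-circuit identifier, while level 1 carries a 6-bit one.

```python
TOTAL_BITS = 64
LEVEL_BITS = 2   # hypothetical width of the level field
NODE_BITS = 8    # hypothetical width of the node (GPU) identifier
# Hypothetical: width of the storage-circuit identifier at each level.
UNIT_BITS = {0: 0, 1: 6}


def _offset_bits(level):
    return TOTAL_BITS - LEVEL_BITS - NODE_BITS - UNIT_BITS[level]


def encode(level, node, unit, offset):
    """Pack the fields into one integer; the layout depends on `level`."""
    unit_bits, offset_bits = UNIT_BITS[level], _offset_bits(level)
    if not (0 <= node < 1 << NODE_BITS
            and 0 <= unit < 1 << unit_bits
            and 0 <= offset < 1 << offset_bits):
        raise ValueError("field out of range for this level")
    addr = level
    addr = (addr << NODE_BITS) | node
    addr = (addr << unit_bits) | unit
    addr = (addr << offset_bits) | offset
    return addr


def decode(addr):
    """Read the level field first, then interpret the remaining bits with it."""
    level = addr >> (TOTAL_BITS - LEVEL_BITS)
    unit_bits, offset_bits = UNIT_BITS[level], _offset_bits(level)
    offset = addr & ((1 << offset_bits) - 1)
    unit = (addr >> offset_bits) & ((1 << unit_bits) - 1)
    node = (addr >> (offset_bits + unit_bits)) & ((1 << NODE_BITS) - 1)
    return level, node, unit, offset


addr = encode(level=1, node=3, unit=5, offset=0x1234)
print(decode(addr))
```

Decoding has to read the level field before anything else, since the same bits mean different things at different levels; this mirrors the abstract's statement that the address is "constructed and interpreted based on the level field".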
    • 8. Granted invention patent
    • Two-level scheduler for multi-threaded processing
    • US08732711B2
    • 2014-05-20
    • US13151094
    • 2011-06-01
    • William James Dally; Stephen William Keckler; David Tarjan; John Erik Lindholm; Mark Alan Gebhart; Daniel Robert Johnson
    • G06F9/46
    • G06F9/4881; G06F9/3851; G06F9/3887
    • One embodiment of the present invention sets forth a technique for scheduling thread execution in a multi-threaded processing environment. A two-level scheduler maintains a small set of active threads called strands to hide function unit pipeline latency and local memory access latency. The strands are a subset of a larger set of pending threads that is also maintained by the two-level scheduler. Pending threads are promoted to strands and strands are demoted to pending threads based on latency characteristics. The two-level scheduler selects strands for execution based on strand state. The longer latency of the pending threads is hidden by selecting strands for execution. When the latency for a pending thread has expired, the pending thread may be promoted to a strand and begin (or resume) execution. When a strand encounters a latency event, the strand may be demoted to a pending thread while the latency is incurred.
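The promote/demote behaviour described above can be sketched as a toy cycle-by-cycle simulation. The policy details are assumptions for illustration: one instruction issues per cycle, strands are picked round-robin, only operations marked "long" count as latency events, and every such event costs a fixed `long_latency` cycles. None of these specifics come from the abstract, and `run` is a hypothetical name.

```python
from collections import deque


def run(threads, max_strands, long_latency):
    """Simulate a two-level scheduler; return a list of (cycle, thread, op).

    `threads` maps a thread name to its list of ops, each "short" or "long".
    """
    # Outer level: pending threads with the cycle at which each becomes ready.
    pending = deque((name, 0) for name, ops in threads.items() if ops)
    strands = deque()                      # inner level: small active set
    pc = {name: 0 for name in threads}     # next op index per thread
    trace, cycle = [], 0
    while pending or strands:
        # Promote ready pending threads while a strand slot is free.
        for _ in range(len(pending)):
            name, ready = pending.popleft()
            if ready <= cycle and len(strands) < max_strands:
                strands.append(name)
            else:
                pending.append((name, ready))
        if strands:
            name = strands.popleft()       # round-robin among strands
            op = threads[name][pc[name]]
            pc[name] += 1
            trace.append((cycle, name, op))
            if pc[name] < len(threads[name]):
                if op == "long":
                    # Latency event: demote until the latency has expired.
                    pending.append((name, cycle + long_latency))
                else:
                    strands.append(name)
        cycle += 1
    return trace


# Hypothetical workload: three threads competing for two strand slots.
workload = {
    "A": ["short", "long", "short"],
    "B": ["short", "short"],
    "C": ["long", "short"],
}
trace = run(workload, max_strands=2, long_latency=3)
for entry in trace:
    print(entry)
```

In this run thread C waits in the pending set until A's long operation frees a strand slot, and cycle 6 issues nothing because both remaining threads are still waiting out their latency.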
    • 9. Invention patent application
    • Hierarchical Memory Addressing
    • US20120075319A1
    • 2012-03-29
    • US13241745
    • 2011-09-23
    • William James Dally
    • G06F13/00; G06F12/06
    • G06F12/0284; G06F12/08; G06F12/0811; G06F2212/251; G06F2212/2515; G06F2212/253; G06F2212/302; G06F2213/0038
    • One embodiment of the present invention sets forth a technique for addressing data in a hierarchical graphics processing unit cluster. A hierarchical address is constructed based on the location of a storage circuit where a target unit of data resides. The hierarchical address comprises a level field indicating a hierarchical level for the unit of data and a node identifier that indicates which GPU within the GPU cluster currently stores the unit of data. The hierarchical address may further comprise one or more identifiers that indicate which storage circuit in a particular hierarchical level currently stores the unit of data. The hierarchical address is constructed and interpreted based on the level field. The technique advantageously enables programs executing within the GPU cluster to efficiently access data residing in other GPUs using the hierarchical address.