    • 41. Granted patent
    • Pipelined computer with operand context queue to simplify context-dependent execution flow
    • Publication number: US5542058A
    • Publication date: 1996-07-30
    • Application number: US317427
    • Filing date: 1994-10-04
    • Inventors: John E. Brown, III; G. Michael Uhler; John H. Edmondson; Debra Bernstein
    • IPC: F02B 75/02; G06F 9/38; G06F 9/30
    • CPC: G06F 9/3824; G06F 9/383; F02B 2075/025
    • Abstract: A macropipelined microprocessor chip adheres to strict read and write ordering by sequentially buffering operands in queues during instruction decode, then removing the operands in order during instruction execution. Any instruction that requires additional access to memory inserts the requests into the queued sequence (in a specifier queue) such that read and write ordering is preserved. A specifier queue synchronization counter captures synchronization points to coordinate memory request operations among the autonomous instruction decode unit, instruction execution unit, and memory sub-system. The synchronization method does not restrict the benefit of overlapped execution in the pipeline. Another feature is treatment of a variable bit field operand type that does not restrict the location of operand data. Instruction execution flows in a pipelined processor having such an operand type are vastly different depending on whether operand data resides in registers or memory. Thus, an operand context queue (field queue) is used to simplify context-dependent execution flow and increase overlap. The field queue allows the instruction decode unit to issue instructions with variable bit field operands normally, sequentially identifying and fetching operands, and communicating the operand context that specifies register or memory residence across the pipeline boundaries to the autonomous execution unit. The mechanism creates opportunity for increasing the overlap of pipelined functions and greatly simplifies the splitting of execution flows.
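The field-queue idea in the abstract can be illustrated with a toy model (all names here are hypothetical, not from the patent): the decode stage enqueues each variable bit field operand together with a context bit, and the execution stage pops entries in FIFO order and selects the register or memory flow.

```python
from collections import deque

# Toy model of the operand context (field) queue. The decode unit tags
# each variable bit field operand with its context -- register or memory
# residence -- and the execution unit pops entries in FIFO order, using
# the context bit to select the execution flow.
field_queue = deque()

def decode(operand, in_register):
    """Decode stage: identify the operand and enqueue it with its context."""
    field_queue.append((operand, in_register))

def execute():
    """Execute stage: dequeue in order; the context selects the flow."""
    operand, in_register = field_queue.popleft()
    if in_register:
        return ("register-flow", operand)   # no extra memory access needed
    return ("memory-flow", operand)         # must issue a memory request first

decode(0x10, in_register=True)
decode(0x20, in_register=False)
assert execute() == ("register-flow", 0x10)
assert execute() == ("memory-flow", 0x20)
```

Because the context crosses the pipeline boundary inside the queue entry itself, the decode and execute units stay decoupled, which is the overlap the abstract describes.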
    • 42. Granted patent
    • Combined write-operand queue and read-after-write dependency scoreboard
    • Publication number: US5471591A
    • Publication date: 1995-11-28
    • Application number: US969126
    • Filing date: 1992-10-30
    • Inventors: John H. Edmondson; Larry L. Biro
    • IPC: F02B 75/02; G06F 9/38; G06F 12/08; G06F 9/312
    • CPC: G06F 12/0804; G06F 12/0811; G06F 12/0831; G06F 9/3836; G06F 9/3838; G06F 9/3857; F02B 2075/025
    • Abstract: In a pipelined digital computer, an instruction decoder decodes register specifiers from multiple instructions, and stores them in a source queue and a destination queue. An execution unit successively obtains source specifiers of an instruction from the source queue, initiates an operation upon the source specifiers, reads a destination specifier from the destination queue, and retires the result at the specified destination. Read-after-write conflicts may occur because the execution unit may overlap execution of a plurality of instructions. Just prior to beginning execution of a current instruction, the destination queue is checked for conflict between the source specifiers of the current instruction and the destination specifiers of previously issued but not yet retired instructions. When an instruction is issued for execution, its destination specifiers in the destination queue are marked to indicate that they are associated with an executed but not yet retired instruction. In a preferred construction, each entry of the queue has a "write pending" bit that is cleared during a flush and when a read pointer is incremented. An issue pointer identifies the entry of an instruction next to be issued, so that the write-pending bit is set when the issue pointer is incremented. Each entry has two comparators enabled by the write-pending bit to detect a conflict with two source specifiers.
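As a toy Python model (names hypothetical, not from the patent), the combined destination queue and RAW scoreboard can be sketched as queue entries carrying a write-pending bit that enables per-entry comparators:

```python
# Toy sketch of the combined destination queue / RAW scoreboard: each entry
# holds a destination register plus a write-pending bit, set at issue and
# cleared at retire; a source specifier conflicts if it matches any pending
# destination.
class DestQueue:
    def __init__(self):
        self.entries = []              # list of [dest_reg, write_pending]

    def decode(self, dest_reg):
        """Decoder appends a destination specifier (pending bit clear)."""
        self.entries.append([dest_reg, False])

    def issue(self):
        """Issue pointer advances: mark the oldest unissued entry pending."""
        for entry in self.entries:
            if not entry[1]:
                entry[1] = True
                return

    def retire(self):
        """Read pointer advances: the oldest entry leaves the queue."""
        self.entries.pop(0)

    def raw_conflict(self, src_regs):
        """The per-entry 'comparators', enabled by the write-pending bit."""
        return any(pending and dest in src_regs
                   for dest, pending in self.entries)

q = DestQueue()
q.decode(3)
q.issue()                         # instruction writing R3: issued, not retired
assert q.raw_conflict({3, 5})     # a later instruction reading R3 must stall
q.retire()
assert not q.raw_conflict({3, 5}) # result written back; conflict cleared
```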
    • 45. Granted patent
    • Supporting late DRAM bank hits
    • Publication number: US08375163B1
    • Publication date: 2013-02-12
    • Application number: US12326060
    • Filing date: 2008-12-01
    • Inventors: John H. Edmondson; Shane Keil
    • IPC: G06F 12/00; G06F 13/00; G06F 13/28
    • CPC: G06F 13/28
    • Abstract: One embodiment of the invention sets forth a mechanism to transmit commands received from an L2 cache to a bank page within the DRAM. An arbiter unit determines which commands from a command sorter to transmit to a command queue. An activate command associated with the bank page related to the commands is also transmitted to an activate queue. The last command in the command queue is marked as "last." An interlock counter stores a count of "last" commands in the read/write command queue. A DRAM controller transmits activate commands and read/write commands from the activate queue and the command queue to the DRAM. Each time a command marked as "last" is encountered, the DRAM controller decrements the interlock counter. If the count in the interlock counter is zero, then the command marked as "last" is marked as "auto-precharge." The "auto-precharge" command, when processed, causes the bank page to be closed.
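A much-simplified, single-page Python model of the "last"/interlock-counter handshake described above (all names hypothetical; real hardware tracks many bank pages at once):

```python
# Toy model: commands for one bank page queue up; the newest is marked
# "last" and an interlock counter tracks outstanding "last" markers. When
# the controller drains a "last" command and the counter reaches zero, no
# further commands for the page are queued, so the command is upgraded to
# auto-precharge, which closes the bank page.
command_queue = []
interlock = 0

def enqueue(cmd):
    global interlock
    if command_queue and command_queue[-1]["last"]:
        command_queue[-1]["last"] = False     # superseded: page must stay open
        interlock -= 1
    command_queue.append({"cmd": cmd, "last": True, "auto_precharge": False})
    interlock += 1

def drain():
    """DRAM controller pops the oldest command and applies the interlock rule."""
    global interlock
    entry = command_queue.pop(0)
    if entry["last"]:
        interlock -= 1
        if interlock == 0:
            entry["auto_precharge"] = True    # safe to close the bank page
    return entry

enqueue("read A")
enqueue("read B")                 # arrives late: "read A" loses its marker
assert drain()["auto_precharge"] is False
assert drain()["auto_precharge"] is True
```

The late-arriving "read B" is the "late DRAM bank hit" of the title: it keeps the page open past the command that would otherwise have closed it.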
    • 46. Granted patent
    • Managing conflicts on shared L2 bus
    • Publication number: US08321618B1
    • Publication date: 2012-11-27
    • Application number: US12510987
    • Filing date: 2009-07-28
    • Inventors: Shane Keil; John H. Edmondson
    • IPC: G06F 13/00
    • CPC: G06F 13/1605; G06F 12/0859
    • Abstract: One embodiment of the present invention sets forth a mechanism to schedule read data transmissions and write data transmissions between a cache and frame buffer logic on the L2 bus. When processing a read or a write command, a scheduling arbiter examines a bus schedule to determine whether a read-read, read-write, or write-read conflict exists, and allocates an available memory space in a read buffer to store the read data causing the conflict until the read return data transmission can be scheduled. In the case of a write command, the scheduling arbiter then transmits a write request to a request buffer. When processing a write request, the request arbiter examines the request buffers to determine whether a write-write conflict exists. If so, then the request arbiter allocates a memory space in a request buffer to store the write request until the write data transmission can be scheduled.
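Loosely, the conflict-then-buffer policy can be sketched as follows (a toy one-transfer-per-cycle schedule; all names hypothetical and much simpler than the patent's separate read/request buffers):

```python
# Toy shared-bus schedule: at most one data transmission per cycle. When a
# requested cycle is already taken (a conflict), the transfer is parked in
# a buffer slot and slid to the next free cycle.
def schedule_transfer(schedule, cycle, name):
    """Place `name` at `cycle` in `schedule` (dict: cycle -> transfer).
    Returns (granted_cycle, was_buffered)."""
    buffered = cycle in schedule      # conflict detected on the shared bus
    while cycle in schedule:          # hold in a buffer until a cycle is free
        cycle += 1
    schedule[cycle] = name
    return cycle, buffered

bus = {}
assert schedule_transfer(bus, 5, "read r0") == (5, False)
assert schedule_transfer(bus, 5, "write w0") == (6, True)   # conflict: parked
```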
    • 48. Granted patent
    • Systems for efficient retrieval from tiled memory surface to linear memory display
    • Publication number: US07986327B1
    • Publication date: 2011-07-26
    • Application number: US11552082
    • Filing date: 2006-10-23
    • Inventor: John H. Edmondson
    • IPC: G06F 12/10; G06F 13/00; G06F 13/28; G06F 9/26; G06F 9/34
    • CPC: G09G 5/395; G09G 5/363; G09G 2350/00; G09G 2360/122
    • Abstract: Embodiments of the present invention set forth a technique for optimizing the on-chip data path between a memory controller and a display controller within a graphics processing unit (GPU). A row selection field and a sector mask are included within a memory access command transmitted from the display controller to the memory controller indicating which row of data is being requested from memory. The memory controller responds to the memory access command by returning only the row of data corresponding to the requested row to the display controller over the on-chip data path. Any extraneous data received by the memory controller in the process of accessing the specifically requested row of data is stripped out and not transmitted back to the display controller. One advantage of the present invention is that the width of the on-chip data path can be reduced by a factor of two or more as a result of the greater operational efficiency gained by stripping out extraneous data before transmitting the data to the display controller.
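The row-selection/sector-mask idea can be illustrated with a small sketch (hypothetical names; real hardware operates on tiled DRAM bursts, not Python lists):

```python
# Toy model: a tiled surface access pulls in a whole tile (several rows of
# sectors), but the command's row-selection field and sector mask let the
# memory controller forward only the requested row's enabled sectors over
# the narrow on-chip path to the display controller.
def fetch_row(tile, row_select, sector_mask):
    row = tile[row_select]                   # drop the extraneous rows
    return [sector for i, sector in enumerate(row)
            if sector_mask & (1 << i)]       # strip masked-off sectors

tile = [["a0", "a1", "a2", "a3"],
        ["b0", "b1", "b2", "b3"]]
# Request row 1, sectors 0 and 2 only:
assert fetch_row(tile, row_select=1, sector_mask=0b0101) == ["b0", "b2"]
```

Only the filtered data crosses the on-chip path, which is what lets that path be narrower than the raw tile fetch.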
    • 49. Patent application
    • Page stream sorter for poor locality access patterns
    • Publication number: US20080109613A1
    • Publication date: 2008-05-08
    • Application number: US11592540
    • Filing date: 2006-11-03
    • Inventors: David A. Jarosh; Sonny S. Yeoh; Colyn S. Case; John H. Edmondson
    • IPC: G06F 12/00; G06F 17/00
    • CPC: G06F 13/1626
    • Abstract: In some applications, such as video motion compression processing for example, a request pattern or "stream" of requests for accesses to memory (e.g., DRAM) may have, over a large number of requests, a relatively small number of requests to the same page. Due to the small number of requests to the same page, conventionally sorting to aggregate page hits may not be very effective. Reordering the stream can be used to "bury" or "hide" much of the necessary precharge/activate time, which can have a highly positive impact on overall throughput. For example, separating accesses to different rows of the same bank by at least a predetermined number of clocks can effectively hide the overhead involved in precharging/activating the rows.
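A toy model of the reordering heuristic (MIN_GAP and all other names are hypothetical; the patent's sorter is far more elaborate):

```python
# Toy scheduler: if two requests hit *different* rows of the *same* DRAM
# bank, issue them at least MIN_GAP clocks apart so the precharge/activate
# latency is hidden behind other banks' traffic.
MIN_GAP = 4

def schedule(requests):
    """requests: list of (bank, row). Returns a list of (clock, bank, row)."""
    last_issue = {}                   # bank -> (clock, row) of last issue
    clock, issues = 0, []
    for bank, row in requests:
        if bank in last_issue and last_issue[bank][1] != row:
            # same bank, different row: enforce the separation window
            clock = max(clock, last_issue[bank][0] + MIN_GAP)
        issues.append((clock, bank, row))
        last_issue[bank] = (clock, row)
        clock += 1
    return issues

# Bank 0 is re-opened on a new row, so its second access waits until clock 4;
# the bank-1 access at clock 1 fills part of that gap.
assert schedule([(0, 10), (1, 20), (0, 11)]) == [(0, 0, 10), (1, 1, 20), (4, 0, 11)]
```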
    • 50. Granted patent
    • Fast area-efficient multi-bit binary adder with low fan-out signals
    • Publication number: US5278783A
    • Publication date: 1994-01-11
    • Application number: US969124
    • Filing date: 1992-10-30
    • Inventor: John H. Edmondson
    • IPC: G06F 7/50; G06F 7/508
    • CPC: G06F 7/508; G06F 2207/5063
    • Abstract: A carry look-ahead adder obtains high speed with minimum gate fan-in and a regular array of area-efficient logic cells in a datapath by including a first row of propagate-generate bit cells, a second row of block-propagate bit cells generating a hierarchy of block-propagate and block-generate bits, a third row of carry bit cells, and a bottom level of sum bit cells. The second row of block-propagate bit cells supplies the block-propagate and block-generate bits to the first carry bit cells in chained segments of carry bit cells. In a preferred embodiment for a 32-bit complementary metal-oxide-semiconductor (CMOS) adder, the logic gates are limited to a fan-in of three, and the block-propagate bit cells in the second row are interconnected to form two binary trees, each including fifteen cells, and the carry cells are chained in segments including up to four cells. In general, the interconnections between the block-propagate bit cells are derived from a graph which is optimized to meet the constraints of fast static CMOS circuit design: low fan-out and small capacitance load on most signals. Sufficient gain stages are present in the binary trees to build up to a large drive capability where it is needed.
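The propagate/generate and block-cell logic is standard carry look-ahead; a minimal sketch (the carry chain is rippled here for brevity rather than built from the patent's binary trees of block cells):

```python
# Minimal carry look-ahead sketch: per-bit propagate/generate terms mirror
# the patent's first row of cells, and `combine` is one block-propagate cell
# merging a high half with its low half.
def pg(a_bit, b_bit):
    """First-row cell: (propagate, generate) for one bit position."""
    return a_bit ^ b_bit, a_bit & b_bit

def combine(hi, lo):
    """Block cell: merge (P, G) of a high half with its low half."""
    return hi[0] & lo[0], hi[1] | (hi[0] & lo[1])

def cla_add(a, b, width=32):
    """Add using the p/g carry recurrence c[i+1] = g[i] | (p[i] & c[i])."""
    bits = [pg((a >> i) & 1, (b >> i) & 1) for i in range(width)]
    carries, carry = [], 0
    for p, g in bits:
        carries.append(carry)
        carry = g | (p & carry)          # rippled here for brevity
    s = 0
    for i, (p, _) in enumerate(bits):
        s |= (p ^ carries[i]) << i       # sum bit = p XOR incoming carry
    return s

assert cla_add(0x1234, 0xFEDC) == (0x1234 + 0xFEDC) & 0xFFFFFFFF
assert combine(pg(1, 0), pg(1, 1)) == (0, 1)   # block generates a carry
```

In the actual design, `combine` cells form the two binary trees of the second row, so all carries are produced in logarithmic depth instead of this linear ripple.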