    • 32. Granted invention patent
    • Screen compression
    • US07965895B1
    • 2011-06-21
    • US11837336
    • 2007-08-10
    • John M. Danskin; Ziyad S. Hakura; Edward L. Riegelsberger; Jason M. Musicer; Stephen D. Lew
    • G06K9/36; G06K9/46
    • G09G5/397; G09G5/393; G09G2340/02; G09G2340/12; G09G2360/122; H04N19/428; H04N19/593
    • Methods, circuits, and apparatus for reducing memory bandwidth used by a graphics processor. Uncompressed tiles are read from a display buffer portion of a graphics memory and received by an encoder. The uncompressed tiles are compressed and written back to the graphics memory. When a tile is needed again before it has been modified, the compressed version is read from memory, uncompressed, and displayed. To reduce the number of unnecessary writes of compressed tiles to memory, a tile is only written to memory if it has remained static for some number of refresh cycles. Also, to prevent a large number of compressed tiles being written to the display buffer in one refresh cycle, the encoder can be throttled after a number of tiles have been written. Validity information can be stored for use by a CRTC. If a tile is updated, the validity information is updated such that invalid compressed data is not read from memory and displayed.
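The policy described in the Screen compression abstract can be sketched as a small simulation. This is an illustrative model only, not the patented circuit: the class and method names, the thresholds `STATIC_CYCLES` and `MAX_WRITES_PER_REFRESH`, and the use of `zlib` as the compressor are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass

STATIC_CYCLES = 2           # assumed: refreshes a tile must stay unchanged before it is compressed
MAX_WRITES_PER_REFRESH = 1  # assumed: encoder throttle limit per refresh cycle


@dataclass
class Tile:
    pixels: bytes
    static_count: int = 0   # consecutive refreshes without modification
    compressed: bytes = b""
    valid: bool = False     # validity bit: is `compressed` safe to display?


class DisplayBuffer:
    """Hypothetical display buffer holding uncompressed and compressed tiles."""

    def __init__(self, tiles):
        self.tiles = [Tile(p) for p in tiles]

    def update(self, index, pixels):
        """A write to a tile invalidates its compressed copy."""
        tile = self.tiles[index]
        tile.pixels = pixels
        tile.static_count = 0
        tile.valid = False

    def refresh(self):
        """Scan out every tile once; return (frame, bytes_read_from_memory)."""
        frame, bytes_read, writes = [], 0, 0
        for tile in self.tiles:
            if tile.valid:
                # Read the smaller compressed copy and decompress it for display.
                bytes_read += len(tile.compressed)
                frame.append(zlib.decompress(tile.compressed))
                continue
            # No valid compressed copy: read the uncompressed tile.
            bytes_read += len(tile.pixels)
            frame.append(tile.pixels)
            tile.static_count += 1
            # Compress only tiles that have stayed static, and throttle the encoder.
            if tile.static_count >= STATIC_CYCLES and writes < MAX_WRITES_PER_REFRESH:
                tile.compressed = zlib.compress(tile.pixels)
                tile.valid = True
                writes += 1
        return frame, bytes_read
```

With two static tiles, the second refresh compresses only the first tile because of the throttle, the third refresh compresses the second, and later refreshes read fewer bytes until `update` clears a validity bit.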
    • 35. Granted invention patent
    • Counter-based delay of dependent thread group execution
    • US07526634B1
    • 2009-04-28
    • US11535871
    • 2006-09-27
    • Jerome F. Duluk, Jr.; Stephen D. Lew; John R. Nickolls
    • G06F9/40
    • G06F9/52; G06F9/546; G06F2209/548
    • Systems and methods for synchronizing processing work performed by threads, cooperative thread arrays (CTAs), or “sets” of CTAs. A central processing unit can load launch commands for a first set of CTAs and a second set of CTAs in a pushbuffer, and specify a dependency of the second set upon completion of execution of the first set. A parallel or graphics processor (GPU) can autonomously execute the first set of CTAs and delay execution of the second set of CTAs until the first set of CTAs is complete. In some embodiments the GPU may determine that a third set of CTAs is not dependent upon the first set, and may launch the third set of CTAs while the second set of CTAs is delayed. In this manner, the GPU may execute launch commands out of order with respect to the order of the launch commands in the pushbuffer.
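The out-of-order launch behavior in the abstract above can be illustrated with a toy scheduler. This is a sketch under stated assumptions, not the patented mechanism: `LaunchCommand`, `launch_order`, the per-set outstanding-CTA counter, and the rule that one CTA per running set retires each step are all hypothetical choices for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LaunchCommand:
    name: str                         # label for a set of CTAs
    num_ctas: int                     # CTAs in the set
    depends_on: Optional[str] = None  # set that must finish first, if any


def launch_order(pushbuffer):
    """Return the order in which CTA sets are launched.

    Assumed model: each launched set has a counter of outstanding CTAs,
    and one CTA per running set retires per step. A dependent set is
    delayed until its prerequisite's counter reaches zero.
    """
    pending = list(pushbuffer)
    remaining = {}  # outstanding-CTA counter per launched set
    order = []
    while pending or any(remaining.values()):
        launched = False
        still_pending = []
        for cmd in pending:
            dep = cmd.depends_on
            if dep is None or remaining.get(dep) == 0:
                remaining[cmd.name] = cmd.num_ctas
                order.append(cmd.name)
                launched = True
            else:
                still_pending.append(cmd)  # delayed; later commands may still launch
        pending = still_pending
        if pending and not launched and not any(remaining.values()):
            raise ValueError("unsatisfiable dependency in pushbuffer")
        for name in remaining:
            if remaining[name] > 0:
                remaining[name] -= 1       # one CTA of each running set retires
    return order
```

For a pushbuffer holding A, then B (dependent on A), then C, the sketch launches A and C first and B only after A's counter reaches zero, so the launch order differs from the pushbuffer order.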
    • 39. Granted invention patent
    • Digital media processor
    • US08253750B1
    • 2012-08-28
    • US12832830
    • 2010-07-08
    • Jen-Hsun Huang; Gerrit A. Slavenburg; Stephen D. Lew; John C. Schafer; Thomas F. Fox; Taner E. Ozcelik
    • G06F13/14; G06F15/80; G06F15/00
    • G06T15/005; G06F15/78; G09G5/003; G09G2360/02; H04N13/161
    • Circuits, methods, and apparatus that provide highly integrated digital media processors for digital consumer electronics applications. These digital media processors are capable of performing the parallel processing of multiple format audio, video, and graphics signals. In one embodiment, audio and video signals may be received from a variety of input devices or appliances, such as antennas, VCRs, DVDs, and networked devices such as camcorders and modems, while output audio and video signals may be provided to output devices such as televisions, monitors, and networked devices such as printers and networked video recorders. Another embodiment of the present invention interfaces with a variety of devices such as navigation, entertainment, safety, memory, and networking devices. This embodiment can also be configured for use in a digital TV, set-top box, or home server. In this configuration, video and audio streams may be received from a number of cable, satellite, Internet, and consumer devices.
    • 40. Granted invention patent
    • Methods for scalably exploiting parallelism in a parallel processing system
    • US08099584B2
    • 2012-01-17
    • US13099035
    • 2011-05-02
    • John R. Nickolls; Stephen D. Lew
    • G06F9/30
    • G06F9/3851; G06F9/30072; G06F9/3012; G06F9/3889; G06F9/5066
    • Parallelism in a parallel processing subsystem is exploited in a scalable manner. A problem to be solved can be hierarchically decomposed into at least two levels of sub-problems. Individual threads of program execution are defined to solve the lowest-level sub-problems. The threads are grouped into one or more thread arrays, each of which solves a higher-level sub-problem. The thread arrays are executable by processing cores, each of which can execute at least one thread array at a time. Thread arrays can be grouped into grids of independent thread arrays, which solve still higher-level sub-problems or an entire problem. Thread arrays within a grid, or entire grids, can be distributed across all of the available processing cores as available in a particular system implementation.
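The hierarchical decomposition in the abstract above (threads, thread arrays, grids) can be sketched sequentially. This is a simplified illustration, not the patented method: the sum-of-squares problem, the function names, and the round-robin assignment of thread arrays to cores are assumptions, and the "cores" here are plain lists rather than real parallel hardware.

```python
def thread(x):
    """Lowest-level sub-problem: one thread squares one element."""
    return x * x


def thread_array(chunk):
    """Higher-level sub-problem: one thread array reduces its chunk."""
    return sum(thread(x) for x in chunk)


def run_grid(data, array_size, num_cores):
    """Solve the whole problem as a grid of independent thread arrays.

    Thread arrays are handed out round-robin to however many cores exist
    (an assumed distribution policy). Returns the total and the number of
    thread arrays each core executed.
    """
    chunks = [data[i:i + array_size] for i in range(0, len(data), array_size)]
    per_core = [[] for _ in range(num_cores)]
    for i, chunk in enumerate(chunks):
        per_core[i % num_cores].append(thread_array(chunk))
    total = sum(sum(results) for results in per_core)
    return total, [len(results) for results in per_core]
```

Because the thread arrays are independent, the total is the same for any `num_cores`; only the distribution of work changes, which is the scalability property the abstract describes.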