    • 3. Granted Invention Patent
    • Title: Efficient context saving and restoring in a multi-tasking computing system environment
    • Publication No.: US06061711A
    • Publication date: 2000-05-09
    • Application No.: US699280
    • Filing date: 1996-08-19
    • Inventors: Seungyoon Peter Song, Moataz A. Mohamed, Heonchul Park, Le T. Nguyen, Jerry R. Van Aken, Alessandro Forin, Andrew R. Raffman
    • IPC: G06F9/44, G06F9/30, G06F9/38, G06F9/46, G06F9/48, G06F17/16
    • CPC: G06F9/30087, G06F9/30003, G06F9/30036, G06F9/30043, G06F9/3009, G06F9/3836, G06F9/3861, G06F9/3877, G06F9/3887, G06F9/461
    • Abstract: In a multi-tasking computing system environment, one program is halted and context switched out so that a processor may context switch in a subsequent program for execution. Processor state information exists which reflects the state of the program being context switched out. Storing this processor state information permits successful resumption of the context-switched-out program. When that program is subsequently context switched back in, the stored processor state information is loaded in preparation for resuming the program at the point at which execution was previously halted. Although large areas of memory can be allocated to processor state information storage, only a portion of this information may need to be preserved across a context switch to successfully save and resume the program. Unnecessarily saving and loading all available processor state information can be noticeably inefficient, particularly where relatively large amounts of processor state information exist. In one embodiment, a processor requests a co-processor to context switch out the currently executing program. At a predetermined appropriate point in the executing program, the co-processor responds by halting program execution and saving only the minimal amount of processor state information necessary for successful restoration of the program. The appropriate point is chosen by the application programmer at a location in the executing program that requires preserving only a minimal portion of the processor state across a context switch. By saving only this minimal amount of processor state, processor time savings accumulate across context save and restore operations.
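The minimal-save scheme this abstract describes can be sketched as a toy model; the register names, the `Task` structure, and the save/restore helpers below are illustrative assumptions, not the patent's actual interface.

```python
# Toy model: at the programmer-chosen switch point only the registers that
# are live need to survive the context switch, so only those are saved.

FULL_STATE = {f"v{i}": 0 for i in range(32)}  # large co-processor register file

class Task:
    def __init__(self, name, live_regs):
        self.name = name
        self.live_regs = live_regs   # registers live at the chosen switch point
        self.saved = {}              # minimal saved context

def context_switch_out(task, regs):
    # Save only the registers marked live at this point,
    # not the whole register file.
    task.saved = {r: regs[r] for r in task.live_regs}

def context_switch_in(task, regs):
    # Restore just the minimal saved portion; the other registers
    # need no defined value at this resumption point.
    regs.update(task.saved)

regs = dict(FULL_STATE)
regs["v0"], regs["v1"] = 7, 9
t = Task("decode", live_regs=["v0", "v1"])   # only 2 of 32 registers are live
context_switch_out(t, regs)
print(len(t.saved), "of", len(regs), "registers saved")  # 2 of 32
```

The saving compounds: every switch out and back in touches two registers instead of thirty-two.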
    • 4. Granted Invention Patent
    • Title: Scalable width vector processor architecture for efficient emulation
    • Publication No.: US5991531A
    • Publication date: 1999-11-23
    • Application No.: US804765
    • Filing date: 1997-02-24
    • Inventors: Seungyoon Peter Song, Heonchul Park
    • IPC: G06F9/06, G06F9/302, G06F9/318, G06F9/455
    • CPC: G06F9/30014, G06F9/30032, G06F9/30036, G06F9/3017
    • Abstract: An N-byte vector processor is provided which can emulate 2N-byte processor operations by executing two N-byte operations sequentially. By using an N-byte architecture to process 2N-byte-wide data, chip size and cost are reduced. One embodiment allows 64-byte operations to be implemented on a 32-byte vector processor by executing a 32-byte instruction on the first 32 bytes of data and then executing a 32-byte instruction on the second 32 bytes. Registers and instructions for 64-byte operation are emulated using pairs of 32-byte registers and instructions, respectively; some instructions require modification to accommodate 64-byte operations between adjacent elements, operations requiring specific element locations, operations shifting elements into and out of registers, and operations specifying addresses exceeding 32 bytes.
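The half-width emulation can be sketched as below; element width and the choice of an add operation are illustrative assumptions, since the abstract covers the technique generically.

```python
# Sketch: a 2N-byte vector operation emulated as two sequential N-byte
# operations, one on the low half and one on the high half.

N = 32  # native vector width in bytes

def vadd_n(a, b):
    # Native N-byte vector add (one byte per element, modulo 256).
    assert len(a) == len(b) == N
    return [(x + y) & 0xFF for x, y in zip(a, b)]

def vadd_2n(a, b):
    # 64-byte operation built from two 32-byte operations issued in sequence.
    assert len(a) == len(b) == 2 * N
    return vadd_n(a[:N], b[:N]) + vadd_n(a[N:], b[N:])

a = list(range(64))
b = [1] * 64
assert vadd_2n(a, b) == [(x + 1) & 0xFF for x in a]
```

Element-wise operations split cleanly like this; as the abstract notes, operations that cross the halfway boundary (adjacent-element operations, shifts into and out of registers) need extra fix-up steps.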
    • 5. Granted Invention Patent
    • Title: Deferred store data read with simple anti-dependency pipeline inter-lock control in superscalar processor
    • Publication No.: US5881307A
    • Publication date: 1999-03-09
    • Application No.: US805389
    • Filing date: 1997-02-24
    • Inventors: Heonchul Park, Seungyoon Peter Song
    • IPC: G06F9/38, G06F9/40, G06F9/30
    • CPC: G06F9/3816, G06F9/3834, G06F9/3836, G06F9/3838, G06F9/384, G06F9/3857, G06F9/3867, G06F9/3885
    • Abstract: A superscalar processor includes an execution unit that executes load/store instructions and an execution unit that executes arithmetic instructions. The execution pipelines for both units include a decode stage, a read stage that identifies and reads source operands for the instructions, and one or more execution stages performed in the execution units. For store instructions, reading store data from the register file is deferred until the store data is required for transfer to the memory system. This allows store instructions to be decoded simultaneously with the earlier instructions that generate the store data. A simple anti-dependency interlock uses a list of the register numbers identifying registers holding store data for pending store instructions. These register numbers are compared to the register numbers of instructions' destination operands, and instructions whose destination operands match a source of store data are stalled in the read stage to prevent them from destroying store data before an earlier store instruction completes.
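The interlock described in the abstract reduces to a membership check against a small list; the function and field names below are assumptions for illustration, not the patent's terminology.

```python
# Sketch of the anti-dependency interlock: while a store's data register is
# pending (its data not yet read for the memory write), any instruction whose
# destination matches that register must stall in the read stage.

pending_store_regs = []   # register numbers holding data for pending stores

def issue_store(data_reg):
    # Store decoded; its register-file data read is deferred.
    pending_store_regs.append(data_reg)

def store_data_read(data_reg):
    # Store data finally transferred to the memory system; clear the interlock.
    pending_store_regs.remove(data_reg)

def must_stall(dest_reg):
    # Read-stage check: writing dest_reg now would destroy undelivered
    # store data belonging to an earlier store instruction.
    return dest_reg in pending_store_regs

issue_store(data_reg=5)
assert must_stall(5)       # an instruction writing r5 stalls
assert not must_stall(6)   # unrelated destinations proceed
store_data_read(5)
assert not must_stall(5)   # interlock released once the store data is read
```

The point of the simplicity: only register numbers are compared, so the hardware cost is a short list of comparators rather than full dependency tracking.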
    • 7. Granted Invention Patent
    • Title: Processor that decodes a multi-cycle instruction into single-cycle micro-instructions and schedules execution of the micro-instructions
    • Publication No.: US5923862A
    • Publication date: 1999-07-13
    • Application No.: US789574
    • Filing date: 1997-01-28
    • Inventors: Le Trong Nguyen, Heonchul Park
    • IPC: G06F9/22, G06F9/28, G06F9/30, G06F9/318, G06F9/38
    • CPC: G06F9/3017, G06F9/28, G06F9/30145, G06F9/30167, G06F9/3836, G06F9/3838, G06F9/3857
    • Abstract: An instruction decoder in a processor decodes an instruction by creating a decode buffer entry that includes global fields, operand fields, and a set of micro-instructions. Each micro-instruction represents an operation that an associated execution unit can execute in a single clock cycle. A scheduler issues the micro-instructions from one or more entries to the execution units for possible parallel and out-of-order execution. Each execution unit typically completes an operation in one clock cycle and does not monitor instructions that may block a pipeline. The execution units need no separate decoding across multiple stages. One global field indicates which micro-instructions execute first, and the micro-instructions themselves have fields that indicate an execution sequence. The scheduler issues operations in the order indicated by the global fields and the micro-instructions. When the last operation for an instruction completes, the instruction is retired and removed from the decode buffer.
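A decode buffer entry of the kind the abstract describes might be modeled as follows; the structure, field names, and the example "load-op" instruction are illustrative assumptions.

```python
# Sketch: a decode buffer entry holds global fields plus single-cycle
# micro-instructions carrying an explicit execution sequence; the scheduler
# issues them in the order those fields indicate.

from dataclasses import dataclass, field

@dataclass
class MicroInstruction:
    op: str
    seq: int          # position in the instruction's execution sequence

@dataclass
class DecodeEntry:
    global_first: str                      # global field: which op executes first
    micro_ops: list = field(default_factory=list)

def schedule(entry):
    # Issue micro-instructions in the order given by their seq fields,
    # starting from the op the global field marks as first.
    order = sorted(entry.micro_ops, key=lambda m: m.seq)
    assert order[0].op == entry.global_first
    return [m.op for m in order]

# A multi-cycle "load-op" instruction decoded into two single-cycle micro-ops.
entry = DecodeEntry("load", [MicroInstruction("add", 1), MicroInstruction("load", 0)])
print(schedule(entry))  # ['load', 'add']
```

Once the final micro-op (`add` here) completes, the entry would be retired and freed from the buffer.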
    • 8. Granted Invention Patent
    • Title: Instruction fetch unit including instruction buffer and secondary or branch target buffer that transfers prefetched instructions to the instruction buffer
    • Publication No.: US5889986A
    • Publication date: 1999-03-30
    • Application No.: US790028
    • Filing date: 1997-01-28
    • Inventors: Le Trong Nguyen, Heonchul Park
    • IPC: G06F9/38, G06F9/06
    • CPC: G06F9/3806, G06F9/3804
    • Abstract: An instruction fetch unit includes a program buffer for the sequential instructions being decoded and a target buffer for an instruction sequence containing the target of the next branch instruction. Scan logic coupled to the program buffer scans it for branch instructions. A target for the first branch instruction is determined, and a request to external memory fills the target buffer with a sequence of instructions including the target instruction before sequential decoding reaches the branch. If the branch is subsequently taken, the instructions from the branch target buffer are transferred to the program buffer. The program buffer may be divided into a main and a secondary buffer, each the same size as the target buffer, and the instruction bus between the instruction fetch unit and external memory is sufficiently wide to fill the main, secondary, or target buffer in a single write operation.
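The fetch path can be sketched as below; the instruction encoding, buffer size, and memory layout are illustrative assumptions chosen to make the flow visible.

```python
# Sketch: scan logic spots the first branch in the program buffer, a wide
# memory read prefetches the target sequence into the target buffer, and
# the target buffer is transferred to the program buffer only if taken.

BUF = 4  # instructions per buffer; one wide memory read fills a whole buffer

memory = {0: ["i0", "i1", "br 100", "i3"], 100: ["t0", "t1", "t2", "t3"]}

program_buffer = memory[0][:]
target_buffer = []

def scan_for_branch(buf):
    # Scan logic: find the first branch and decode its target address.
    for insn in buf:
        if insn.startswith("br "):
            return int(insn.split()[1])
    return None

target = scan_for_branch(program_buffer)
if target is not None:
    # Prefetch completes before sequential decoding reaches the branch.
    target_buffer = memory[target][:]

branch_taken = True
if branch_taken:
    # Single transfer of the prefetched sequence into the program buffer.
    program_buffer = target_buffer

print(program_buffer)  # ['t0', 't1', 't2', 't3']
```

If the branch falls through instead, the target buffer's contents are simply discarded and sequential decoding continues from the program buffer.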
    • 9. Granted Invention Patent
    • Title: Single chip design for fast image compression
    • Publication No.: US5468069A
    • Publication date: 1995-11-21
    • Application No.: US100928
    • Filing date: 1993-08-03
    • Inventors: Viktor K. Prasanna, Cho-Li Wang, Heonchul Park
    • IPC: G06T9/00, G06K9/36, G06K9/68
    • CPC: G06T9/008
    • Abstract: Video data compression techniques reduce the necessary storage size and communication channel bandwidth while maintaining acceptable fidelity. Vector quantization provides better overall data compression performance by coding vectors instead of scalars. A search algorithm and a VLSI architecture for implementing it are disclosed herein; the search algorithm is useful for real-time image processing. The architecture employs a single processing element and external memory for storing the N constant-value hyperplanes used in the search, where N is the number of codevectors. The design performs no multiplication operations in the constant-value hyperplane tree search, since the tree search method is independent of any L_q metric for q between one and infinity. The memory used by the design is significantly less than that employed in existing architectures.
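A multiplication-free search over constant-value hyperplanes can be sketched with axis-aligned splits of the form x[dim] <= c, which need only comparisons; the tree layout and codevector indices below are illustrative assumptions, not the patent's actual codebook.

```python
# Sketch: tree search over constant-value hyperplanes (x[dim] <= c).
# Each step is a single comparison, so no multiplications are required,
# consistent with the metric-independent search the abstract describes.

# Internal nodes: (dim, threshold, left, right). Leaves: codevector index.
tree = (0, 5.0,
        (1, 2.0, 0, 1),    # x[0] <= 5: split again on x[1]
        (1, 7.0, 2, 3))    # x[0] >  5: split on x[1]

def search(node, x):
    # Walk the hyperplane tree using comparisons only.
    while isinstance(node, tuple):
        dim, c, left, right = node
        node = left if x[dim] <= c else right
    return node  # index of the selected codevector

assert search(tree, [3.0, 1.0]) == 0
assert search(tree, [9.0, 8.0]) == 3
```

Because each decision compares one coordinate against a stored constant, the same traversal selects the same codevector under any L_q metric, which is what lets the hardware drop multipliers entirely.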