    • 3. Granted invention patent
    • Register allocation method and apparatus for truncating runaway lifetimes of program variables in a computer system
    • US5761514A
    • 1998-06-02
    • US522052
    • 1995-08-31
    • Nava Arela Aizikowitz; Roy Bar-Haim; Edward Curtis Prosser; Robert Ralph Roediger; William Jon Schmidt
    • Nava Arela Aizikowitz; Roy Bar-Haim; Edward Curtis Prosser; Robert Ralph Roediger; William Jon Schmidt
    • G06F9/45
    • G06F8/433; G06F8/441
    • A method and apparatus for truncating runaway lifetimes of program variables calculates liveness for each variable based on upwardly exposed uses. Reaching definitions are then calculated for at least the program variables that have runaway lifetimes. The liveness information is compared to the reaching definition information to determine whether a variable that is live upon entry to a basic block has a definition that reaches the end of each predecessor block, or has a use within the basic block. If the reaching definition for a variable reaches the beginning of the block and if there is a predecessor block for which there is no reaching definition, the variable has a runaway lifetime. The variable also has a runaway lifetime if there is a use of the variable in a block without a reaching definition for the variable at the beginning of the block. The runaway lifetime is truncated by inserting an instruction such as a pseudo-definition of the variable into the instruction stream at an appropriate place. Once runaway lifetimes are truncated using this method, subsequent stages of the compiler may calculate liveness by performing a single dataflow analysis which calculates lifetimes based on upwardly exposed uses.
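The comparison of liveness with reaching definitions described in this abstract can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the `Block` type, the `analyze` and `truncate` helpers, and the choice of where to place the pseudo-definition (end of the predecessor that lacks a reaching definition, or start of the using block) are all assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    ops: list                                   # ordered ("def", var) / ("use", var) pairs
    succs: list = field(default_factory=list)   # names of successor blocks

def analyze(cfg):
    """cfg maps block name -> Block. Returns (live_in, reach_in, reach_out, preds, runaway)."""
    preds = {n: [] for n in cfg}
    for n, blk in cfg.items():
        for s in blk.succs:
            preds[s].append(n)

    # Local sets: upwardly exposed uses and variables defined in each block.
    exposed, defined = {}, {}
    for n, blk in cfg.items():
        exposed[n], defined[n] = set(), set()
        for kind, var in blk.ops:
            if kind == "use" and var not in defined[n]:
                exposed[n].add(var)
            elif kind == "def":
                defined[n].add(var)

    # Iterate backward liveness and forward (may) reaching definitions to a fixed point.
    live_in = {n: set() for n in cfg}
    reach_in = {n: set() for n in cfg}
    reach_out = {n: set() for n in cfg}
    changed = True
    while changed:
        changed = False
        for n, blk in cfg.items():
            live_out = set().union(*(live_in[s] for s in blk.succs))
            new_live = exposed[n] | (live_out - defined[n])
            new_rin = set().union(*(reach_out[p] for p in preds[n]))
            new_rout = new_rin | defined[n]
            if (new_live, new_rin, new_rout) != (live_in[n], reach_in[n], reach_out[n]):
                live_in[n], reach_in[n], reach_out[n] = new_live, new_rin, new_rout
                changed = True

    runaway = set()
    for n in cfg:
        for v in live_in[n]:
            # Case 1: a definition reaches the block, but not through every predecessor.
            partial = v in reach_in[n] and any(v not in reach_out[p] for p in preds[n])
            # Case 2: the block uses the variable with no reaching definition at all.
            undefined_use = v in exposed[n] and v not in reach_in[n]
            if partial or undefined_use:
                runaway.add((v, n))
    return live_in, reach_in, reach_out, preds, runaway

def truncate(cfg):
    """Insert pseudo-definitions (assumed placement) to cut off runaway lifetimes."""
    _, reach_in, reach_out, preds, runaway = analyze(cfg)
    for v, n in runaway:
        if v in reach_in[n]:
            for p in preds[n]:
                if v not in reach_out[p]:
                    cfg[p].ops.append(("def", v))
        else:
            cfg[n].ops.insert(0, ("def", v))

cfg = {
    "entry": Block([], ["a", "b"]),
    "a": Block([("def", "x")], ["join"]),
    "b": Block([], ["join"]),
    "join": Block([("use", "x")], []),
}
print(analyze(cfg)[4])          # {('x', 'join')}
truncate(cfg)
print(analyze(cfg)[0]["entry"]) # set()
```

In the example, `x` is defined on only one of two paths into `join`, so a liveness pass based on upwardly exposed uses would report it live all the way back to `entry`. After the pseudo-definition is appended to block `b`, the same liveness pass alone gives the shorter lifetime.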
    • 4. Granted invention patent
    • Instruction cache alignment mechanism for branch targets based on predicted execution frequencies
    • US06301652B1
    • 2001-10-09
    • US08593309
    • 1996-01-31
    • Edward Curtis Prosser; Robert Ralph Roediger; William Jon Schmidt
    • Edward Curtis Prosser; Robert Ralph Roediger; William Jon Schmidt
    • G06F9/40
    • G06F8/4442
    • A compiler system and method is provided that can 1) generate a second instruction stream from a first instruction stream, 2) read in and process predetermined external information regarding the basic blocks that make up the second instruction stream and 3) place certain of the basic blocks on cache line boundaries based on predicted execution frequencies. In particular, the compiler system and method utilize profile information containing predicted block execution or edge-weight execution frequencies to determine which of the basic blocks to align on cache line boundaries. One method for obtaining profile information includes precompiling the source code, creating an executable program, executing the program with test inputs, and outputting a profile containing execution frequency information. Once the profile information is obtained, the source code can then be recompiled using the profile information. The compiler can then selectively cache align those blocks identified as important.
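The selective alignment step in this abstract can be illustrated with a small layout routine. This is a sketch under stated assumptions: the `layout` function, the 64-byte line size, and the fixed `hot_threshold` used to decide which blocks count as important are hypothetical choices, since the abstract does not say how importance is determined from the profile.

```python
def layout(blocks, line_size=64, hot_threshold=1000):
    """Assign addresses to basic blocks, padding hot blocks to a cache line boundary.

    blocks: list of (name, size_in_bytes, predicted_execution_count) in layout order.
    Returns ({name: start_address}, total_padding_bytes).
    """
    addr, placed, padding = 0, {}, 0
    for name, size, count in blocks:
        misaligned = addr % line_size
        if count >= hot_threshold and misaligned:
            pad = line_size - misaligned   # bytes of filler before the hot block
            addr += pad
            padding += pad
        placed[name] = addr
        addr += size
    return placed, padding

profile = [
    ("entry", 20, 1),
    ("loop", 48, 5000),
    ("cold", 10, 2),
    ("hot2", 30, 1200),
]
placed, padding = layout(profile)
print(placed)   # {'entry': 0, 'loop': 64, 'cold': 112, 'hot2': 128}
print(padding)  # 50
```

Only the two frequently executed blocks are moved to line boundaries. The cold block is packed directly after its predecessor, which keeps the padding cost limited to the blocks that the profile marks as worth it.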
    • 7. Granted invention patent
    • Compiler apparatus and method for optimizing loops in a computer program
    • US06938249B2
    • 2005-08-30
    • US09992324
    • 2001-11-19
    • Robert Ralph Roediger; William Jon Schmidt
    • Robert Ralph Roediger; William Jon Schmidt
    • G06F9/45; G06F11/34
    • G06F11/3466; G06F8/443; G06F2201/865
    • A profile-based loop optimizer generates an execution frequency table for each loop that gives more detailed profile data that allows making a more intelligent decision regarding if and how to optimize each loop in the computer program. The execution frequency table contains entries that correlate a number of times a loop is executed each time the loop is entered with a count of the occurrences of each number during the execution of an instrumented instruction stream. The execution frequency table is used to determine whether there is one dominant mode that appears in the profile data, and if so, optimizes the loop according to the dominant mode. The optimizer may perform optimizations by peeling a loop, by unrolling a loop, and by performing both peeling and unrolling on a loop according to the profile data in the execution frequency table for the loop. In this manner the execution time of the resulting code is minimized according to the detailed profile data in the execution frequency tables, resulting in a computer program with loops that are more fully optimized.
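The execution frequency table in this abstract maps each observed iteration count to the number of times the loop ran that many iterations. A rough sketch of building the table and acting on a dominant mode follows. The function names, the 80% share that qualifies a mode as dominant, and the peel and unroll limits are assumptions for illustration; the abstract does not give the actual decision rules.

```python
from collections import Counter

def frequency_table(trip_counts):
    """Map iterations-per-entry -> how often that count was observed in the profile."""
    return Counter(trip_counts)

def dominant_mode(table, share=0.8):
    """Return the trip count that accounts for at least `share` of loop entries, if any."""
    total = sum(table.values())
    if total == 0:
        return None
    trips, hits = table.most_common(1)[0]
    return trips if hits / total >= share else None

def plan(table, peel_limit=4, unroll_factor=4):
    """Pick how many iterations to peel and the unroll factor (1 means no unrolling)."""
    mode = dominant_mode(table)
    if mode is None:
        return {"peel": 0, "unroll": 1}        # no dominant mode: leave the loop alone
    if mode <= peel_limit:
        return {"peel": mode, "unroll": 1}     # short loops: peel every iteration
    # Longer loops: peel the remainder, then unroll the rest evenly.
    return {"peel": mode % unroll_factor, "unroll": unroll_factor}

short = frequency_table([2] * 90 + [50] * 10)
long_ = frequency_table([10] * 85 + [3] * 15)
print(plan(short))  # {'peel': 2, 'unroll': 1}
print(plan(long_))  # {'peel': 2, 'unroll': 4}
```

A plain average would hide the difference between these two profiles and a flat one such as `[1, 2, 3, 4]`, for which `plan` leaves the loop untouched. Keeping the full distribution is what makes the peel, unroll, or combined decision possible.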
    • 8. Granted invention patent
    • Method and apparatus for modular reordering of portions of a computer program based on profile data
    • US6029004A
    • 2000-02-22
    • US819526
    • 1997-03-17
    • Vita Bortnikov; Bilha Mendelson; Mark Novick; Robert Ralph Roediger; William Jon Schmidt; Inbal Shavit-Lottem
    • Vita Bortnikov; Bilha Mendelson; Mark Novick; Robert Ralph Roediger; William Jon Schmidt; Inbal Shavit-Lottem
    • G06F9/45; G06F9/44
    • G06F8/445
    • An apparatus and method reorder portions of a computer program in a way that achieves both enhanced performance and maintainability of the computer program. A global call graph is initially constructed that includes profile data. From the information in the global call graph, an intramodular call graph is generated for each module. Reordering techniques are used to reorder the procedures in each module according to the profile data in each intramodular call graph. An intermodular call graph is generated from the information in the global call graph. Reordering techniques are used to reorder the modules in the computer program. By reordering procedures within modules, then reordering the modules, enhanced performance is achieved without reordering procedures across module boundaries. Respecting module boundaries enhances the maintainability of the computer program by allowing a module to be replaced without adversely affecting the other modules while still providing many of the advantages of global procedure reordering.
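The two-level scheme in this abstract (reorder procedures inside each module, then reorder the modules) can be sketched as follows. The abstract only says "reordering techniques", so the `greedy_order` helper below is a simple stand-in chosen for the example, not the technique used in the patent. The procedure names and call counts are made up.

```python
from collections import defaultdict

def greedy_order(nodes, weights):
    """Place the most connected node first, then keep appending the node that is
    most heavily connected to the one just placed. weights: {(a, b): count}."""
    adj = defaultdict(int)
    for (a, b), w in weights.items():
        adj[a, b] += w
        adj[b, a] += w
    remaining = sorted(nodes)          # sorted so that ties resolve deterministically
    if not remaining:
        return []
    first = max(remaining, key=lambda n: sum(adj[n, m] for m in nodes))
    order = [first]
    remaining.remove(first)
    while remaining:
        nxt = max(remaining, key=lambda n: adj[n, order[-1]])
        order.append(nxt)
        remaining.remove(nxt)
    return order

def reorder(calls, module_of):
    """calls: global call graph {(caller, callee): profile count};
    module_of: {procedure: module}. Returns the final procedure layout."""
    intra = defaultdict(dict)   # module -> intramodular call graph
    inter = defaultdict(int)    # intermodular call graph, keyed by (module, module)
    for (caller, callee), count in calls.items():
        m1, m2 = module_of[caller], module_of[callee]
        if m1 == m2:
            intra[m1][caller, callee] = count
        else:
            inter[m1, m2] += count
    final = []
    for module in greedy_order(sorted(set(module_of.values())), inter):
        procs = [p for p in module_of if module_of[p] == module]
        final.extend(greedy_order(procs, intra[module]))
    return final

module_of = {"main": "app", "parse": "app", "log": "app", "read": "io", "write": "io"}
calls = {("main", "parse"): 100, ("main", "log"): 1, ("parse", "read"): 80,
         ("log", "write"): 2, ("read", "write"): 5}
print(reorder(calls, module_of))  # ['main', 'parse', 'log', 'read', 'write']
```

Each module's procedures stay contiguous in the output, so no procedure crosses a module boundary even though `parse` and `read` call each other often. That is the trade-off the abstract describes: some locality is given up so that a module can be replaced as a unit.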
    • 10. Granted invention patent
    • Incorporating register pressure into an inlining compiler
    • US06983459B1
    • 2006-01-03
    • US09286862
    • 1999-04-06
    • Edward Curtis Prosser; William Jon Schmidt
    • Edward Curtis Prosser; William Jon Schmidt
    • G06F9/45
    • G06F8/4443
    • A method, system, and program product for optimizing compilation. In the preferred embodiment, a compiler compiles a source-code file twice; once to gather register-pressure data, and a second time to apply the data. Thus, the compiler saves register-pressure data during the first compilation and uses it during the second compilation to make informed inlining decisions. The compiler saves two kinds of data during the first compilation: (1) the maximum register-pressure occurring in each procedure; and (2) within each procedure, the register pressure at each call site that is a potential inlining candidate. This data is then fed into the compiler during the second compilation. The compiler uses the data during the second compilation in two ways. First, when deciding whether to inline a child procedure into a parent procedure, the compiler determines whether the sum of the maximum register-pressure and the site register-pressure exceeds the number of available, physical registers. If so, the inlining is not done. Otherwise, inlining is permitted subject to other heuristics. Second, if the child procedure is chosen for inlining into the parent procedure, the maximum register-pressure of the parent procedure is set to be the maximum of its existing value or the sum of the maximum register-pressure of the child procedure and the site register-pressure. This assures that later consideration of the parent procedure for inlining into another procedure can be done with accurate register-pressure data available.
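The two uses of register-pressure data in this abstract reduce to one comparison and one update. The sketch below assumes the first compilation has already produced the two tables. The `try_inline` helper, the table layout, and the procedure and call-site names are hypothetical, and the "other heuristics" the abstract mentions are left out.

```python
def try_inline(max_pressure, site_pressure, parent, child, site, physical_regs):
    """Decide whether `child` may be inlined at `site` inside `parent`.

    max_pressure:  {procedure: maximum register pressure seen in that procedure}
    site_pressure: {(parent, site): register pressure at that call site}
    On success, the parent's maximum is updated so later decisions see accurate data.
    """
    combined = max_pressure[child] + site_pressure[parent, site]
    if combined > physical_regs:
        return False                       # inlining would exceed the register file
    max_pressure[parent] = max(max_pressure[parent], combined)
    return True

max_pressure = {"leaf": 10, "mid": 12, "top": 8}
site_pressure = {("mid", "call_leaf"): 15, ("top", "call_mid"): 9}

print(try_inline(max_pressure, site_pressure, "mid", "leaf", "call_leaf", 32))  # True
print(max_pressure["mid"])                                                       # 25
print(try_inline(max_pressure, site_pressure, "top", "mid", "call_mid", 32))    # False
```

The second call shows why the update matters. Using the stale value of 12 for `mid` would give 12 + 9 = 21 and permit the inline, while the updated value of 25 gives 34, which exceeds the 32 registers assumed here.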