    • 41. Granted patent (invention)
    • Efficient code generation using loop peeling for SIMD loop code with multiple misaligned statements
    • US08171464B2
    • 2012-05-01
    • US12122050
    • 2008-05-16
    • Alexandre E. Eichenberger; Kai-Ting Amy Wang; Peng Wu
    • G06F9/45; G06F15/00
    • G06F8/447; G06F8/4441
    • An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposes no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD codes based on the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied to the peeled iterations, while the steady-state loop body incurs no additional computational overhead.
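The peeling idea in the abstract above can be illustrated with a small simulation. This is a minimal sketch, not the patented code generator: it assumes a 4-lane vector (`V = 4`), models aligned-only hardware with the hypothetical helpers `vload`, `vstore`, and `vsplice`, and uses a loop whose load and store share the same misalignment so that no stream shifting is needed. Only the peeled first and last vector iterations splice new lanes into old memory contents; the steady-state loop stores full vectors.

```python
V = 4  # lanes per simulated SIMD vector (assumed width for illustration)

def vload(mem, addr):
    """Aligned-only load: the address is truncated down to a multiple of V."""
    base = addr - addr % V
    return mem[base:base + V]

def vstore(mem, addr, vec):
    """Aligned-only store of one full vector."""
    base = addr - addr % V
    mem[base:base + V] = vec

def vsplice(old, new, start, stop):
    """Take lanes [start, stop) from `new` and all other lanes from `old`."""
    return [new[j] if start <= j < stop else old[j] for j in range(V)]

def add_one_scalar(a, b, k, n):
    """Reference loop: a[i] = b[i] + 1 for i in [k, k + n)."""
    for i in range(k, k + n):
        a[i] = b[i] + 1

def add_one_simd(a, b, k, n):
    """Same loop using only aligned vector accesses.

    Assumes len(a) and len(b) are multiples of V and cover [k, k + n).
    """
    if n <= 0:
        return
    lo, hi = k, k + n
    first = lo - lo % V              # aligned base of the first vector touched
    last = (hi - 1) - (hi - 1) % V   # aligned base of the last vector touched

    def peeled(base):
        # Peeled iteration: splice so lanes outside [lo, hi) keep old values.
        new = [x + 1 for x in vload(b, base)]
        start, stop = max(lo - base, 0), min(hi - base, V)
        vstore(a, base, vsplice(vload(a, base), new, start, stop))

    peeled(first)                            # prologue
    for base in range(first + V, last, V):   # steady state: no splice
        vstore(a, base, [x + 1 for x in vload(b, base)])
    if last != first:
        peeled(last)                         # epilogue
```

Calling `add_one_simd(a, b, 3, 9)` on 16-element lists updates `a[3]` through `a[11]` and leaves the neighbouring elements untouched, matching the scalar loop.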
    • 45. Patent application (invention)
    • SIMD Code Generation For Loops With Mixed Data Lengths
    • US20090144529A1
    • 2009-06-04
    • US12328730
    • 2008-12-04
    • Alexandre E. Eichenberger; Kai-Ting Amy Wang; Peng Wu
    • G06F9/00; G06F9/45
    • G06F8/4452
    • Generating loop code to execute on Single-Instruction Multiple-Datapath (SIMD) architectures, where the loop operates on datatypes having different lengths, is disclosed. Further, a preferred embodiment of the present invention includes a novel technique to efficiently realign or shift arbitrary streams to an arbitrary offset, regardless of whether the alignments or offsets are known at compile time. This technique enables the application of advanced alignment optimizations to runtime alignment. Length conversion operations, for packing and unpacking data values, are included in the alignment handling framework. These operations are formally defined in terms of standard SIMD instructions that are readily available on various SIMD platforms. This allows sequential loop code operating on datatypes of disparate length to be transformed (“simdized”) into optimized SIMD code through a fully automated process.
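The length-conversion step described above can be sketched as follows. This is an illustrative simulation only: it assumes a 16-byte vector (8 lanes of 16-bit data or 4 lanes of 32-bit data), ignores alignment handling entirely, and uses hypothetical helper names (`unpack`, `pack`, `wrap16`) that do not come from the patent. The loop adds a 16-bit stream to a 32-bit stream and stores a 16-bit result, so each iteration unpacks one narrow vector into two wide ones and packs the two wide sums back into one narrow vector.

```python
N16 = 8  # 16-bit lanes in a 16-byte vector (assumed vector width)
N32 = 4  # 32-bit lanes in a 16-byte vector

def wrap16(x):
    """Truncate an integer to a signed 16-bit value (two's complement)."""
    return ((x + 0x8000) & 0xFFFF) - 0x8000

def unpack(vec16):
    """Widen one vector of 8 narrow lanes into two vectors of 4 wide lanes."""
    return vec16[:N32], vec16[N32:]

def pack(lo32, hi32):
    """Narrow two vectors of 4 wide lanes into one vector of 8 narrow lanes."""
    return [wrap16(x) for x in lo32 + hi32]

def mixed_scalar(a16, b32):
    """Reference loop: c16[i] = (short)(a16[i] + b32[i])."""
    return [wrap16(x + y) for x, y in zip(a16, b32)]

def mixed_simd(a16, b32):
    """Vectorized form; assumes len(a16) == len(b32) is a multiple of N16."""
    out = []
    for i in range(0, len(a16), N16):
        lo, hi = unpack(a16[i:i + N16])                  # 1 narrow -> 2 wide
        s_lo = [x + y for x, y in zip(lo, b32[i:i + N32])]
        s_hi = [x + y for x, y in zip(hi, b32[i + N32:i + N16])]
        out.extend(pack(s_lo, s_hi))                     # 2 wide -> 1 narrow
    return out
```

The point of the sketch is the lane-count mismatch: one vector of the narrow type corresponds to two vectors of the wide type, which is why pack and unpack operations have to appear in the vectorized loop body.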
    • 48. Granted patent (invention)
    • MWW type zeolite substance, precursor substance therefor, and process for producing these substances
    • US07326401B2
    • 2008-02-05
    • US10506532
    • 2003-02-26
    • Takashi Tatsumi; Peng Wu; Katsuyuki Tsuji
    • C01B39/04; C01B39/06; B01J29/06
    • C01B39/08; C01B37/005; C01B39/085; C01B39/48
    • A process for easily synthesizing a zeolite substance containing an element having a large ionic radius in the framework at a high ratio. This process comprises the following first to fourth steps: First Step: a step of heating a mixture containing a template compound, a compound containing a Group 13 element of the periodic table, a silicon-containing compound and water to obtain a precursor (A); Second Step: a step of acid-treating the precursor (A) obtained in the first step; Third Step: a step of heating the acid-treated precursor (A) obtained in the second step together with a mixture containing a template compound and water to obtain a precursor (B); and Fourth Step: a step of calcining the precursor (B) obtained in the third step to obtain a zeolite substance.
    • 49. Patent application (invention)
    • Method for improving processing of relatively aligned memory references for increased reuse opportunities
    • US20070226453A1
    • 2007-09-27
    • US11387218
    • 2006-03-23
    • Alexandre Eichenberger; Rohini Nair; Kai-Ting Wang; Peng Wu; Peng Zhao
    • G06F15/00
    • G06F9/3885; G06F9/30036; G06F9/345; G06F9/383
    • Computer implemented method, system and computer program product for aligning vectors to be processed by SIMD code. A pair of vectors to be aligned at runtime and having a known relative alignment at compile time is identified. A modified second memory reference is generated by modifying an address of the second memory reference to be in a same congruence class as the first memory reference, wherein the congruence class is mod V and wherein V is SIMD byte width. A first SIMD load located at the modified second memory reference and a next adjacent SIMD load located at a third memory reference corresponding to the modified second memory reference address plus V are loaded, and the first SIMD load and the next adjacent SIMD load are concatenated to generate a resultant vector of length 2V. The resultant vector is left shifted by an amount corresponding to a difference between the addresses of the first memory reference and the second memory reference mod V, and the leftmost V bytes of the resultant vector are retained to align the first and second vectors.
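The alignment steps in the abstract above can be traced with a byte-level simulation. This is a sketch under stated assumptions: `V = 16` is an assumed SIMD byte width, `aligned_load` is a hypothetical model of hardware that truncates load addresses to a multiple of `V`, and `load_aligned_with` is an illustrative name. The difference `d = (q - p) mod V` is what the method treats as known at compile time, while `p` itself may only be known at run time.

```python
V = 16  # SIMD width in bytes (assumed for illustration)

def aligned_load(mem, addr):
    """Model an aligned-only load: the address is truncated to a multiple of V."""
    base = addr - addr % V
    return mem[base:base + V]

def load_aligned_with(mem, p, q):
    """Load the data at q so that it lines up lane-for-lane with a load at p.

    Assumes mem extends at least two full vectors past the truncated address.
    """
    d = (q - p) % V                      # relative alignment of the two references
    q_mod = q - d                        # now q_mod is congruent to p modulo V
    first = aligned_load(mem, q_mod)     # SIMD load at the modified reference
    nxt = aligned_load(mem, q_mod + V)   # next adjacent SIMD load
    both = first + nxt                   # concatenate into a 2V-byte vector
    return both[d:d + V]                 # shift left by d, keep leftmost V bytes
```

With `mem = bytes(range(64))`, `p = 5`, and `q = 24`, the load at `p` places `mem[5]` in lane 5, and `load_aligned_with(mem, 5, 24)` returns `mem[19:35]`, which places `mem[24]` in lane 5 as well. Because both loads truncate by the same amount, the shift count `d` stays fixed for every runtime value of `p`.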
    • 50. Patent application (invention)
    • Computer-implemented method, system, and program product for deployment time optimization of a distributed application
    • US20070198973A1
    • 2007-08-23
    • US11345748
    • 2006-02-02
    • Jong-Deok Choi; Manish Gupta; Parviz Kermani; Kang-Won Lee; Kyung Ryu; Dinesh Verma; Peng Wu
    • G06F9/45
    • G06F8/61
    • A computer-implemented method, system, and program product for optimizing a distributed (software) application are provided. Specifically, a configuration of a target computing environment, in which the distributed application is deployed, is discovered upon deployment of the distributed application. Thereafter, based on a set of rules and the discovered configuration, one or more optimization techniques are applied to optimize the distributed application. In a typical embodiment, the set of rules can be embedded in the distributed application, or they can be accessed from an external source such as a repository. Regardless, the optimization techniques applied can include at least one of the following: (1) identification and replacement of an underperforming component of the distributed application with a new component; (2) generation of interface layers (to allow selection of optimal bindings) between distributed objects of the distributed application; and/or (3) execution of code transformation of the distributed application using program analysis techniques.
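The rule-driven selection described above might look roughly like the following. Everything here is hypothetical and chosen only for illustration: the configuration keys (`cpu_count`, `has_local_db`), the rule contents, and the optimization names are assumptions, not details from the application. The sketch shows only the control flow, in which a discovered configuration is matched against a rule set to decide which optimizations to apply.

```python
def discover_config():
    """Stand-in for deployment-time discovery of the target environment."""
    return {"cpu_count": 8, "has_local_db": True}  # hypothetical values

# Each rule pairs a predicate over the discovered configuration with the
# name of an optimization to apply when the predicate holds (all invented).
RULES = [
    (lambda cfg: cfg["has_local_db"], "select_in_process_db_binding"),
    (lambda cfg: cfg["cpu_count"] >= 4, "replace_with_parallel_component"),
    (lambda cfg: cfg["cpu_count"] < 2, "replace_with_single_thread_component"),
]

def select_optimizations(cfg, rules=RULES):
    """Return the optimizations whose rule matches the configuration, in rule order."""
    return [name for predicate, name in rules if predicate(cfg)]
```

Here the rule list is embedded in the program; per the abstract, the rules could equally be fetched from an external source such as a repository.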