    • 3. Invention application
    • EFFICIENT COLUMN BASED DATA ENCODING FOR LARGE-SCALE DATA STORAGE
    • WO2010014956A2
    • 2010-02-04
    • PCT/US2009/052491
    • 2009-07-31
    • MICROSOFT CORPORATION
    • NETZ, Amir; PETCULESCU, Cristian; CRIVAT, Ioan, Bogdan
    • G06F7/76; G06F7/78
    • G06F17/30501; G06F17/30315; H03M7/30; H03M7/48
    • The subject disclosure relates to column based data encoding where raw data to be compressed is organized by columns, and then, as first and second layers of reduction of the data size, dictionary encoding and/or value encoding are applied to the data as organized by columns, to create integer sequences that correspond to the columns. Next, a hybrid greedy run length encoding and bit packing compression algorithm further compacts the data according to an analysis of bit savings. Synergy of the hybrid data reduction techniques in concert with the column-based organization, coupled with gains in scanning and querying efficiency owing to the representation of the compact data, results in substantially improved data compression at a fraction of the cost of conventional systems.
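The abstract of entry 3 describes dictionary encoding of a column into integers, followed by a hybrid of run length encoding and bit packing chosen by bit savings. The following is a minimal sketch of that idea, not the patented algorithm: the function names, the segment tuples, and the cost model (a run costs one value plus an assumed 32-bit length) are all illustrative assumptions, and the row reordering a full greedy scheme might perform is left out.

```python
def dictionary_encode(column):
    """Map each distinct value to a small integer ID in first-seen order."""
    dictionary = {}
    ids = [dictionary.setdefault(value, len(dictionary)) for value in column]
    return dictionary, ids


def bits_needed(max_value):
    """Bits required to store any integer in [0, max_value]."""
    return max(1, max_value.bit_length())


def hybrid_compress(ids):
    """Split an integer sequence into RLE runs and bit-packed segments.

    A run becomes an ("rle", value, length) segment only when that is
    cheaper, under the assumed cost model, than bit packing its values.
    Everything else is collected into ("bitpack", width, values).
    """
    if not ids:
        return []
    width = bits_needed(max(ids))
    run_cost = width + 32  # assumption: one value plus a 32-bit run length
    segments, packed = [], []
    i = 0
    while i < len(ids):
        j = i
        while j < len(ids) and ids[j] == ids[i]:
            j += 1
        run_len = j - i
        if run_len * width > run_cost:
            if packed:  # flush pending bit-packed values before the run
                segments.append(("bitpack", width, packed))
                packed = []
            segments.append(("rle", ids[i], run_len))
        else:
            packed.extend(ids[i:j])
        i = j
    if packed:
        segments.append(("bitpack", width, packed))
    return segments


def decompress(segments):
    """Expand segments back into the original integer sequence."""
    out = []
    for seg in segments:
        out.extend([seg[1]] * seg[2] if seg[0] == "rle" else seg[2])
    return out
```

For a column of 40 repeated values, three mixed values, and 50 repeated values, this sketch yields one RLE segment, one short bit-packed segment, and a second RLE segment, and `decompress` restores the original IDs.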
    • 4. Invention application
    • EFFICIENT LARGE-SCALE JOINING FOR QUERYING OF COLUMN BASED DATA ENCODED STRUCTURES
    • WO2010039895A2
    • 2010-04-08
    • PCT/US2009/059114
    • 2009-09-30
    • MICROSOFT CORPORATION
    • PETCULESCU, Cristian; NETZ, Amir
    • G06F17/30; G06F17/00
    • G06F17/3048; G06F17/30315; G06F17/30498
    • The subject disclosure relates to querying of column based data encoded structures enabling efficient query processing over large scale data storage, and more specifically, with respect to join operations. Initially, a compact structure is received that represents the data according to a column based organization, and various compression and data packing techniques, already enabling a highly efficient and fast query response in real-time. On top of already fast querying enabled by the compact column oriented structure, a scalable, fast algorithm is provided for query processing in memory, which constructs an auxiliary data structure, also column-oriented, for use in join operations, which further leverages characteristics of in-memory data processing and access, as well as the column-oriented characteristics of the compact data structure.
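The abstract of entry 4 mentions an auxiliary, column-oriented data structure built in memory for join operations, but does not say what it contains. One plausible reading, offered purely as an assumption, is a lookup column indexed by the dictionary ID of the fact table's key and holding the matching row position in the other table, so that each join probe becomes an array access. The names `build_join_index` and `join_column` are hypothetical.

```python
def build_join_index(fact_key_dict, dim_keys):
    """Build the auxiliary column used by the join.

    fact_key_dict maps each key value to its dictionary ID on the fact side.
    dim_keys is the key column of the dimension table, one entry per row.
    The result holds, for every fact-side ID, the matching dimension row
    position, or -1 when the key has no match.
    """
    dim_row_of = {key: row for row, key in enumerate(dim_keys)}
    index = [-1] * len(fact_key_dict)
    for key, key_id in fact_key_dict.items():
        index[key_id] = dim_row_of.get(key, -1)
    return index


def join_column(fact_key_ids, join_index, dim_column):
    """Fetch one dimension attribute per fact row via the auxiliary column."""
    return [
        dim_column[join_index[key_id]] if join_index[key_id] >= 0 else None
        for key_id in fact_key_ids
    ]


# Usage with made-up data: four fact rows joined to a two-row dimension table.
fact_key_dict = {"p1": 0, "p2": 1, "p9": 2}
fact_key_ids = [0, 1, 0, 2]
join_index = build_join_index(fact_key_dict, dim_keys=["p2", "p1"])
names = join_column(fact_key_ids, join_index, dim_column=["Widget", "Gadget"])
```

The index is built once per join and has one entry per distinct key rather than one per row, which is why it stays small when the key column is dictionary encoded.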
    • 5. Invention application
    • EFFICIENT LARGE-SCALE PROCESSING OF COLUMN BASED DATA ENCODED STRUCTURES
    • WO2010014955A2
    • 2010-02-04
    • PCT/US2009/052490
    • 2009-07-31
    • MICROSOFT CORPORATION
    • NETZ, Amir; PETCULESCU, Cristian
    • G06F17/30; G06F17/00
    • G06F17/30492
    • The subject disclosure relates to efficient query processing over large scale data storage. An exemplary process includes retrieving a subset of columns implicated by a query as integer encoded and compressed sequences of values corresponding to different columns of data, defining query processing buckets that span over the subset of columns based on changes of compression type occurring in the integer encoded and compressed sequences of values of the subset of data and processing the query in memory on a bucket by bucket basis and processing the query based on type of current bucket when processing the integer encoded and compressed sequences of values. The column based organization of the data, and the application of a hybrid run length encoding and bit packing technique, enable a highly efficient and speedy query response in real-time.
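The abstract of entry 5 describes query processing buckets whose boundaries fall wherever the compression type changes in any of the columns involved, with each bucket then handled according to its type. The sketch below illustrates that idea for a filtered sum. The segment layout, `("rle", value, length)` and `("bitpack", width, values)`, and every function name are assumptions made for illustration, not the format used in the application.

```python
def seg_len(seg):
    """Number of rows covered by a segment."""
    return seg[2] if seg[0] == "rle" else len(seg[2])


def make_buckets(columns):
    """Row ranges [start, end) in which every column stays inside one segment."""
    cuts = set()
    for segments in columns:
        pos = 0
        for seg in segments:
            cuts.add(pos)
            pos += seg_len(seg)
        cuts.add(pos)
    bounds = sorted(cuts)
    return list(zip(bounds, bounds[1:]))


def slice_column(segments, start, end):
    """Return ("rle", value) or ("bitpack", values) for rows [start, end)."""
    pos = 0
    for seg in segments:
        length = seg_len(seg)
        if pos <= start and end <= pos + length:
            if seg[0] == "rle":
                return ("rle", seg[1])
            return ("bitpack", seg[2][start - pos:end - pos])
        pos += length
    raise ValueError("bucket spans a segment boundary")


def sum_where(filter_col, value_col, wanted):
    """SUM(value_col) over rows where filter_col == wanted, bucket by bucket."""
    total = 0
    for start, end in make_buckets([filter_col, value_col]):
        f = slice_column(filter_col, start, end)
        v = slice_column(value_col, start, end)
        n = end - start
        if f[0] == "rle" and v[0] == "rle":
            # Pure RLE bucket: one comparison and one multiplication.
            if f[1] == wanted:
                total += v[1] * n
        elif f[0] == "rle":
            # Constant filter: accept or skip the whole bucket at once.
            if f[1] == wanted:
                total += sum(v[1])
        else:
            # Bit-packed filter: fall back to a row-by-row scan.
            values = [v[1]] * n if v[0] == "rle" else v[1]
            total += sum(x for k, x in zip(f[1], values) if k == wanted)
    return total


# Usage with made-up data: two 10-row columns with different segment layouts.
region = [("rle", 0, 4), ("bitpack", 2, [1, 0, 2]), ("rle", 1, 3)]
amount = [("bitpack", 4, [5, 7, 9]), ("rle", 10, 5), ("bitpack", 4, [3, 4])]
total_region_0 = sum_where(region, amount, wanted=0)
```

The benefit shows up in the first two branches: a bucket that is run length encoded in the filter column is accepted or rejected with a single comparison, regardless of how many rows it covers.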