    • 33. Granted patent
    • Floating point conversion for records of multidimensional database
    • US06424972B1
    • 2002-07-23
    • US09602610
    • 2000-06-22
    • Alexander Berger, Amir Netz, Cristian Petculescu
    • G06F17/30
    • H03M7/24; G06F17/30569; G06F17/30592; Y10S707/99933; Y10S707/99942; Y10S707/99943
    • A method and system for compressing and decompressing read-only data in records that have a fixed size. A plurality of records are divided into segments having a predetermined size. For each segment, the records are arranged in a table with a row for each record and a column for each field in each record. The width of each column of repeated data is compressed to zero bits and the repeated data is referenced in a header of the segment. The width of each column of integer data is compressed to the minimum number of bits required to represent the largest integer value in the fields of the column. Floating point data in each column is converted to integer data and the width of each column with converted integer data is set to the minimum width necessary to represent the largest converted integer in each column. The conversion to integer data is calculated for floating point and real numbers with a minimum precision exponent that is stored in the header for the segment. Floating point data is cleaned when it is converted to integer data. The information in the header is employed to decompress the compressed records in the segment. The decompression lends itself well to fast random access of secondary storage devices.
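The float-to-integer step in the abstract above can be illustrated with a minimal sketch. It assumes a base-10 reading of the "minimum precision exponent" and non-negative values; the function names and the header layout are hypothetical and are not taken from the patent.

```python
from decimal import Decimal


def min_precision_exponent(values):
    """Smallest e >= 0 such that value * 10**e is a whole number for every value."""
    exp = 0
    for v in values:
        # str() gives the shortest decimal text that round-trips, which
        # discards binary representation noise before the conversion.
        d = Decimal(str(v)).normalize()
        exp = max(exp, -d.as_tuple().exponent)
    return exp


def encode_column(values):
    """Convert a float column to integers plus a small header (assumed layout)."""
    if any(v < 0 for v in values):
        raise ValueError("this sketch assumes non-negative values")
    exp = min_precision_exponent(values)
    ints = [int(Decimal(str(v)).scaleb(exp)) for v in values]
    # Column width: bits needed for the largest converted integer.
    width = max(ints, default=0).bit_length()
    return {"exponent": exp, "width": width}, ints


def decode_column(header, ints):
    """Recover the floats using only the exponent stored in the header."""
    return [float(Decimal(i).scaleb(-header["exponent"])) for i in ints]


header, ints = encode_column([12.5, 0.25, 3.0])
print(header, ints)
print(decode_column(header, ints))
```

Going through `Decimal` rather than multiplying floats directly is a design choice of this sketch: it keeps the scaled values exact, which is one plausible interpretation of the "cleaning" the abstract mentions.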
    • 34. Granted patent
    • Processing records in dynamic ranges
    • US09087094B2
    • 2015-07-21
    • US13092978
    • 2011-04-25
    • Amir Netz, Cristian Petculescu
    • G06F17/30; G06F17/00
    • G06F17/30454; G06F17/30412
    • A scalable analysis system is described herein that performs common data analysis operations such as distinct counts and data grouping in a more scalable and efficient manner. The system allows distinct counts and data grouping to be applied to large datasets with predictable growth in the cost of the operation. The system dynamically partitions data based on the actual data distribution, which provides both scalability and uncompromised performance. The system sets a budget of available memory or other resources to use for the operation. As the operation progresses, the system determines whether the budget of memory is nearing exhaustion. Upon detecting that the memory used is near the limit, the system dynamically partitions the data. If the system still detects memory pressure, then the system partitions again, until a partition level is identified that fits within the memory budget.
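The budget-driven repartitioning described in the abstract above can be sketched for a distinct count. This is an illustration only: it swaps in plain hash partitioning where the patent partitions on the actual data distribution, it measures the budget as a number of set entries, and the names `distinct_count`, `budget`, and `max_parts` are hypothetical.

```python
def distinct_count(values, budget, max_parts=1 << 16):
    """Count distinct values while holding at most `budget` entries in memory.

    `values` must be re-iterable (for example a list). Returns the distinct
    count and the partition level that fit within the budget.
    """
    if budget < 1:
        raise ValueError("budget must be at least 1")
    parts = 1
    while parts <= max_parts:
        total = 0
        fits = True
        for p in range(parts):
            seen = set()
            for v in values:
                if hash(v) % parts == p:
                    seen.add(v)
                    if len(seen) > budget:
                        fits = False  # memory pressure: this level is too coarse
                        break
            if not fits:
                break
            total += len(seen)
        if fits:
            return total, parts
        parts *= 2  # partition again and retry
    raise RuntimeError("no partition level fits within the budget")


print(distinct_count(list(range(10)) * 3, budget=4))
```

Each partition is counted independently, so only one partition's set is alive at a time; the cost is one extra pass over the data per partition.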
    • 37. Patent application
    • EFFICIENT LARGE-SCALE JOINING FOR QUERYING OF COLUMN BASED DATA ENCODED STRUCTURES
    • US20100088309A1
    • 2010-04-08
    • US12335341
    • 2008-12-15
    • Cristian Petculescu, Amir Netz
    • G06F17/30
    • G06F16/24552; G06F16/221; G06F16/2456
    • The subject disclosure relates to querying of column based data encoded structures enabling efficient query processing over large scale data storage, and more specifically, with respect to join operations. Initially, a compact structure is received that represents the data according to a column based organization, and various compression and data packing techniques, already enabling a highly efficient and fast query response in real-time. On top of already fast querying enabled by the compact column oriented structure, a scalable, fast algorithm is provided for query processing in memory, which constructs an auxiliary data structure, also column-oriented, for use in join operations, which further leverages characteristics of in-memory data processing and access, as well as the column-oriented characteristics of the compact data structure.
    • 38. Patent application
    • EFFICIENT LARGE-SCALE PROCESSING OF COLUMN BASED DATA ENCODED STRUCTURES
    • US20100030748A1
    • 2010-02-04
    • US12270872
    • 2008-11-14
    • Amir Netz, Cristian Petculescu
    • G06F7/06; G06F17/30
    • G06F17/30492
    • The subject disclosure relates to efficient query processing over large scale data storage. An exemplary process includes retrieving a subset of columns implicated by a query as integer encoded and compressed sequences of values corresponding to different columns of data; defining query processing buckets that span the subset of columns based on changes of compression type occurring in those integer encoded and compressed sequences; and processing the query in memory on a bucket by bucket basis, according to the type of the current bucket. The column based organization of the data, and the application of a hybrid run length encoding and bit packing technique, enable a highly efficient and speedy query response in real-time.
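Bucket-by-bucket processing, as outlined in the abstract above, can be sketched for one hypothetical query: the sum of column B over rows where column A equals a target. The segment layout (`("run", value, count)` for run-length segments, `("bits", [values])` for bit-packed ones) is an assumption made for this example, and both columns are assumed non-empty and of equal length.

```python
def spans(column):
    """Attach [start, end) row ranges to each segment of a column."""
    out, pos = [], 0
    for seg in column:
        n = seg[2] if seg[0] == "run" else len(seg[1])
        out.append((pos, pos + n, seg))
        pos += n
    return out


def value_at(span, row):
    start, _, seg = span
    return seg[1] if seg[0] == "run" else seg[1][row - start]


def sum_where(col_a, col_b, target):
    a_spans, b_spans = spans(col_a), spans(col_b)
    # A bucket boundary falls wherever either column changes segment.
    cuts = sorted({s for s, _, _ in a_spans + b_spans} | {a_spans[-1][1]})
    total, ia, ib = 0, 0, 0
    for lo, hi in zip(cuts, cuts[1:]):
        while a_spans[ia][1] <= lo:
            ia += 1
        while b_spans[ib][1] <= lo:
            ib += 1
        a, b = a_spans[ia], b_spans[ib]
        a_run, b_run = a[2][0] == "run", b[2][0] == "run"
        if a_run and a[2][1] != target:
            continue  # whole bucket fails the filter, skip it unread
        if a_run and b_run:
            total += b[2][1] * (hi - lo)  # pure-run bucket: one multiply
        else:
            total += sum(value_at(b, r) for r in range(lo, hi)
                         if value_at(a, r) == target)
    return total


col_a = [("run", 1, 4), ("bits", [2, 1, 2])]
col_b = [("bits", [5, 6]), ("run", 10, 5)]
print(sum_where(col_a, col_b, target=1))
```

The point of the bucket boundaries is that the compression type of every involved column is constant inside a bucket, so the cheapest strategy can be chosen once per bucket rather than once per row.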
    • 40. Granted patent
    • Efficient column based data encoding for large-scale data storage
    • US08452737B2
    • 2013-05-28
    • US13347367
    • 2012-01-10
    • Amir Netz, Cristian Petculescu, Ioan Bogdan Crivat
    • G06F17/30
    • G06F17/30501; G06F17/30315; H03M7/30; H03M7/48
    • The subject disclosure relates to column based data encoding where raw data to be compressed is organized by columns, and then, as first and second layers of reduction of the data size, dictionary encoding and/or value encoding are applied to the data as organized by columns, to create integer sequences that correspond to the columns. Next, a hybrid greedy run length encoding and bit packing compression algorithm further compacts the data according to an analysis of bit savings. Synergy of the hybrid data reduction techniques in concert with the column-based organization, coupled with gains in scanning and querying efficiency owing to the representation of the compact data, results in substantially improved data compression at a fraction of the cost of conventional systems.
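The two layers described in the abstract above, dictionary encoding followed by a hybrid of run-length and bit-packed segments, can be sketched as follows. This is a simplified illustration: a fixed `min_run` threshold stands in for the patent's analysis of bit savings, literal segments are kept as plain lists rather than actually packed into bits, and all names are hypothetical.

```python
from itertools import groupby


def dictionary_encode(column):
    """Replace each value with a small integer id, in order of first appearance."""
    dictionary = {}
    ids = [dictionary.setdefault(v, len(dictionary)) for v in column]
    return ids, list(dictionary)


def hybrid_encode(ids, min_run=3):
    """Store long runs as ("run", id, count); gather the rest for bit packing."""
    segments, literal = [], []
    for value, group in groupby(ids):
        n = len(list(group))
        if n >= min_run:
            if literal:
                segments.append(("bits", literal))
                literal = []
            segments.append(("run", value, n))
        else:
            literal.extend([value] * n)
    if literal:
        segments.append(("bits", literal))
    return segments


def hybrid_decode(segments, dictionary):
    out = []
    for seg in segments:
        ids = [seg[1]] * seg[2] if seg[0] == "run" else seg[1]
        out.extend(dictionary[i] for i in ids)
    return out


column = ["x", "x", "x", "x", "y", "z", "y", "y", "y"]
ids, dictionary = dictionary_encode(column)
segments = hybrid_encode(ids)
print(segments)
print(hybrid_decode(segments, dictionary) == column)
```

In a fuller version each `"bits"` segment would be packed at `max(ids).bit_length()` bits per value, and the run threshold would be chosen by comparing the bits each representation would consume.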