    • 1. Granted invention patent
    • Variance-optimal sampling-based estimation of subset sums
    • US08005949B2
    • 2011-08-23
    • US12325340
    • 2008-12-01
    • Nicholas Duffield; Carsten Lund; Mikkel Thorup; Edith Cohen; Haim Kaplan
    • G06F15/173
    • G06F17/18; H04L41/142; H04L43/024; H04L43/16
    • The present invention relates to a method of obtaining a generic sample of an input stream. The method is designated as VAROPTk. The method comprises receiving an input stream of items arriving one at a time, and maintaining a sample S of items i. The sample S has a capacity for at most k items i. The sample S is filled with k items i. An nth item i is received. It is determined whether the nth item i should be included in sample S. If the nth item i is included in sample S, then a previously included item i is dropped from sample S. The determination is made based on weights of items without distinguishing between previously included items i and the nth item i. The determination is implemented thereby updating weights of items i in sample S. The method is repeated until no more items are received.
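The abstract above describes the update step only in outline. As a minimal sketch of one way such a step could look, the Python below assumes a threshold rule that the abstract does not spell out: after the new item joins the k sampled items, a threshold tau is chosen so that the inclusion probabilities min(1, w/tau) of the k+1 candidates sum to k, one candidate is dropped with probability 1 - min(1, w/tau), and every surviving candidate below tau has its weight raised to tau. The function name `varopt_sample`, the (key, weight) tuple layout, and the `rng` parameter are hypothetical choices made for illustration, not taken from the patent.

```python
import random


def varopt_sample(stream, k, rng=None):
    """Keep at most k (key, adjusted_weight) pairs from a weighted stream.

    Illustrative sketch only; the threshold rule is an assumption.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = rng or random.Random()
    sample = []
    for key, weight in stream:
        if weight <= 0:
            raise ValueError("weights must be positive")
        # The new item is treated exactly like the items already sampled.
        sample.append((key, float(weight)))
        if len(sample) <= k:
            continue  # still filling the sample
        sample.sort(key=lambda kw: kw[1])
        # Find the number t of "small" candidates and the threshold tau
        # such that sum(min(1, w / tau)) over all k + 1 candidates equals k.
        prefix = sample[0][1]
        for t in range(2, k + 2):
            prefix += sample[t - 1][1]
            tau = prefix / (t - 1)
            if t == k + 1 or sample[t][1] >= tau:
                break
        # Drop exactly one small candidate; the drop probabilities
        # 1 - w / tau of the t small candidates sum to 1.
        u = rng.random()
        drop = t - 1  # fallback in case of floating-point rounding
        acc = 0.0
        for i in range(t):
            acc += 1.0 - sample[i][1] / tau
            if u < acc:
                drop = i
                break
        # Surviving small candidates get adjusted weight tau; large ones keep theirs.
        small = [(kk, tau) for i, (kk, _) in enumerate(sample[:t]) if i != drop]
        sample = small + sample[t:]
    return sample
```

Under this rule the adjusted weights in the sample always add up to the total weight seen so far, and the weight of any subset of the stream can be estimated by summing the adjusted weights of the sampled items that fall in that subset.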
    • 4. Invention patent application
    • Method for summarizing data in unaggregated data streams
    • US20110153554A1
    • 2011-06-23
    • US12653831
    • 2009-12-18
    • Edith Cohen; Nicholas Duffield; Haim Kaplan; Carsten Lund; Mikkel Thorup
    • G06F17/30
    • H04L43/028; H04L43/04
    • A method for producing a summary A of data points in an unaggregated data stream wherein the data points are in the form of weighted keys (a, w) where a is a key and w is a weight, and the summary is a sample of k keys a with adjusted weights wa. A first reservoir L includes keys having adjusted weights which are additions of weights of individual data points of included keys and a second reservoir T includes keys having adjusted weights which are each equal to a threshold value τ whose value is adjusted based upon tests of new data points arriving in the data stream. The summary combines the keys and adjusted weights of the first reservoir L with the keys and adjusted weights of the second reservoir T to form the sample representing the data stream upon which further analysis may be performed. The method proceeds by first merging new data points in the stream into the reservoir L until the reservoir contains k different keys and thereafter applying a series of tests to new arriving data points to determine what keys and weights are to be added to or removed from the reservoirs L and T to provide a summary with a variance that approaches the minimum possible for aggregated data sets. The method is composable, can be applied to high speed data streams such as those found on the Internet, and can be implemented efficiently.
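To make the two-reservoir layout in the abstract above more concrete, here is a small Python sketch of the data structure only. It models the initial phase (merging data points into L until it holds k different keys) and the way the summary combines L and T. The "series of tests" applied afterwards is not specified in the abstract and is deliberately left out. The class name `TwoReservoirSummary` and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TwoReservoirSummary:
    """Illustrative container for reservoirs L and T; not the patented procedure."""

    k: int
    tau: float = 0.0                       # threshold shared by all keys in T
    L: dict = field(default_factory=dict)  # key -> sum of its data-point weights
    T: set = field(default_factory=set)    # keys whose adjusted weight equals tau

    def absorb_initial(self, key, weight):
        """Initial phase: merge (key, weight) into L until L holds k keys.

        Returns False once L already contains k different keys; from then on
        the later-phase tests (not modeled here) would decide what happens.
        """
        if len(self.L) >= self.k:
            return False
        self.L[key] = self.L.get(key, 0.0) + weight
        return True

    def summary(self):
        """Combine L's keys and weights with T's keys, each weighted tau."""
        combined = dict(self.L)
        combined.update({key: self.tau for key in self.T})
        return combined
```

For example, with `k=2`, feeding the data points `("a", 1.0)`, `("a", 2.0)`, `("b", 5.0)` leaves `L == {"a": 3.0, "b": 5.0}`, and a further point for a new key `"c"` is no longer absorbed by the initial phase.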
    • 6. Granted invention patent
    • Method for summarizing data in unaggregated data streams
    • US08195710B2
    • 2012-06-05
    • US12653831
    • 2009-12-18
    • Edith Cohen; Nicholas Duffield; Haim Kaplan; Carsten Lund; Mikkel Thorup
    • G06F17/00
    • H04L43/028; H04L43/04
    • A method for producing a summary A of data points in an unaggregated data stream wherein the data points are in the form of weighted keys (a, w) where a is a key and w is a weight, and the summary is a sample of k keys a with adjusted weights wa. A first reservoir L includes keys having adjusted weights which are additions of weights of individual data points of included keys and a second reservoir T includes keys having adjusted weights which are each equal to a threshold value τ whose value is adjusted based upon tests of new data points arriving in the data stream. The summary combines the keys and adjusted weights of the first reservoir L with the keys and adjusted weights of the second reservoir T to form the sample representing the data stream upon which further analysis may be performed. The method proceeds by first merging new data points in the stream into the reservoir L until the reservoir contains k different keys and thereafter applying a series of tests to new arriving data points to determine what keys and weights are to be added to or removed from the reservoirs L and T to provide a summary with a variance that approaches the minimum possible for aggregated data sets. The method is composable, can be applied to high speed data streams such as those found on the Internet, and can be implemented efficiently.