    • 1. Granted invention patent
    • Determination of rules by providing data records in columnar data structures
    • US08671111B2
    • 2014-03-11
    • US13465210
    • 2012-05-07
    • Patrick Dantressangle; Eberhard Hechler; Martin Oberhofer; Michael Wurst
    • G06F17/30
    • G06F17/30321; G06F17/30315
    • A method includes providing a columnar database comprising a plurality of columnar data structures associated with one column attribute; providing first data records having a plurality of first attribute-value pairs comprising counting information indicative of a number of first data records having the respective first attribute-value pair; providing mask data structures comprising one or more second attribute-value pairs; selecting second data records by intersecting the columnar data structures and the mask data structures; selecting one of the column attributes and one value contained in the column data structure associated with said selected column attribute as the destination attribute-value pair; creating one second rule for each first attribute-value pair; calculating, for each second rule, a co-occurrence-count between its respective source attribute-value pair and its destination attribute-value pair; and specifically selecting one or more of said second rules as the first rules in dependence on the calculated co-occurrence-count.
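The selection steps in the abstract above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the patented method: each columnar data structure is modeled as a set of record ids per attribute-value pair, the mask is a list of attribute-value pairs that every selected record must match, and rules are kept when their co-occurrence count reaches a threshold. The names `build_columns`, `mine_rules`, and `min_count` are hypothetical.

```python
from collections import defaultdict

def build_columns(records):
    """Map each (attribute, value) pair to the set of record ids holding it."""
    columns = defaultdict(set)
    for rid, record in enumerate(records):
        for attribute, value in record.items():
            columns[(attribute, value)].add(rid)
    return columns

def mine_rules(records, mask_pairs, destination, min_count):
    """Return (source, destination, count) rules with count >= min_count."""
    columns = build_columns(records)
    # Select records by intersecting the columns named in the mask.
    selected = set(range(len(records)))
    for pair in mask_pairs:
        selected &= columns.get(pair, set())
    dest_ids = columns.get(destination, set()) & selected
    rules = []
    for source, ids in columns.items():
        # Skip the mask pairs and the destination attribute itself.
        if source in mask_pairs or source[0] == destination[0]:
            continue
        count = len(ids & dest_ids)  # co-occurrence of source and destination
        if count >= min_count:
            rules.append((source, destination, count))
    return sorted(rules, key=lambda rule: (-rule[2], rule[0]))

# Hypothetical sample data, invented for illustration.
records = [
    {"country": "DE", "product": "A", "churn": "yes"},
    {"country": "DE", "product": "B", "churn": "yes"},
    {"country": "DE", "product": "A", "churn": "no"},
    {"country": "FR", "product": "A", "churn": "yes"},
    {"country": "DE", "product": "A", "churn": "yes"},
]
rules = mine_rules(records, [("country", "DE")], ("churn", "yes"), min_count=2)
print(rules)
```

Because every attribute-value pair is stored as a set of ids, both the mask selection and the co-occurrence count reduce to set intersections, which is the property the columnar layout is meant to exploit.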
    • 2. Invention patent application
    • PROBABILISTIC DATA MINING MODEL COMPARISON
    • US20120084251A1
    • 2012-04-05
    • US13214105
    • 2011-08-19
    • Christoph Lingenfelder; Pascal Pompey; Michael Wurst
    • G06F7/00; G06F17/00
    • G06F17/18; G06K9/62
    • A first data mining model and a second data mining model are compared. A first data mining model M1 represents results of a first data mining task on a first data set D1 and provides a set of first prediction values. A second data mining model M2 represents results of a second data mining task on a second data set D2 and provides a set of second prediction values. A relation R is determined between said sets of prediction values. For at least a first record of an input data set, a first and second probability distribution is created based on the first and second data mining models applied to the first record. A distance measure d is calculated for said first record using the first and second probability distributions and the relation. At least one region of interest is determined based on said distance measure d.
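A minimal sketch of the comparison described in the abstract above, under stated assumptions: the relation R is modeled as a plain mapping from the first model's prediction values to the second model's, the distance d is the total variation distance (the abstract does not name a specific measure), and a "region of interest" is simply the list of records whose distance exceeds a threshold. The two toy models and all function names are hypothetical.

```python
def map_distribution(dist, relation):
    """Push a distribution over M1's values onto M2's values via relation R."""
    mapped = {}
    for value, prob in dist.items():
        target = relation[value]
        mapped[target] = mapped.get(target, 0.0) + prob
    return mapped

def distance(p1, p2, relation):
    """Total variation distance between the mapped P1 and P2 (assumed measure)."""
    mapped = map_distribution(p1, relation)
    labels = set(mapped) | set(p2)
    return 0.5 * sum(abs(mapped.get(l, 0.0) - p2.get(l, 0.0)) for l in labels)

def regions_of_interest(records, model1, model2, relation, threshold):
    """Records on which the two models disagree by more than the threshold."""
    return [r for r in records
            if distance(model1(r), model2(r), relation) > threshold]

# Hypothetical models: each returns a probability distribution for a record.
def model1(record):
    if record["age"] < 40:
        return {"low": 0.7, "medium": 0.2, "high": 0.1}
    return {"low": 0.1, "medium": 0.2, "high": 0.7}

def model2(record):
    if record["income"] > 50:
        return {"safe": 0.9, "risky": 0.1}
    return {"safe": 0.2, "risky": 0.8}

relation = {"low": "safe", "medium": "safe", "high": "risky"}
records = [
    {"age": 30, "income": 60},
    {"age": 30, "income": 20},
    {"age": 50, "income": 20},
]
interesting = regions_of_interest(records, model1, model2, relation, 0.5)
print(interesting)
```

Mapping through R first is what allows models with different prediction value sets to be compared at all; any other distance between distributions could be substituted in `distance`.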
    • 3. Invention patent application
    • Selective Storing of Mining Models for Enabling Interactive Data Mining
    • US20110153664A1
    • 2011-06-23
    • US12951542
    • 2010-11-22
    • Alexander Lang; Bernhard Mitschang; Ruben Pulido de los Reyes; Christoph Sieb; Michael Wurst
    • G06F17/30
    • G06F17/30286
    • Computerized methods, data processing systems, and computer program products for storing of data mining models (DMMs) are provided. A new DMM is created having at least one of the following characteristics: quality and complexity. The new DMM is handled as a candidate for storing in a storage device if a predefined criterion for the characteristics is met. The sum of the sizes of the new DMM and already stored DMMs is determined. In response to the sum falling below a storage limit, the new DMM is stored in the storage device. In response to the sum exceeding the storage limit, a decision on which DMMs to store in the storage device is taken based on priorities of the DMMs. The priorities depend at least on access frequencies of the DMMs. Upon a data mining request, a corresponding DMM is determined and a user is requested to confirm that data mining is to proceed if quality of the determined DMM does not fulfill a further predefined criterion.
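The storage decision in the abstract above might look roughly like the following sketch. Several choices here are assumptions and not taken from the source: the candidate criterion is a minimum quality, the priority is the raw access count, and when the limit is exceeded the store greedily keeps the most frequently accessed models that still fit. `Model`, `ModelStore`, and `offer` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    size: int
    quality: float
    accesses: int = 0

class ModelStore:
    def __init__(self, limit, min_quality):
        self.limit = limit              # storage limit, in the same unit as size
        self.min_quality = min_quality  # assumed candidate criterion
        self.models = {}

    def offer(self, model):
        """Try to store a new model; return True if it ends up stored."""
        if model.quality < self.min_quality:
            return False  # not even a candidate for storing
        candidates = [m for m in self.models.values() if m.name != model.name]
        candidates.append(model)
        if sum(m.size for m in candidates) > self.limit:
            # Over the limit: keep the most frequently accessed models that fit.
            candidates.sort(key=lambda m: m.accesses, reverse=True)
            kept, used = [], 0
            for m in candidates:
                if used + m.size <= self.limit:
                    kept.append(m)
                    used += m.size
            candidates = kept
        self.models = {m.name: m for m in candidates}
        return model.name in self.models

store = ModelStore(limit=100, min_quality=0.6)
results = [
    store.offer(Model("a", size=60, quality=0.8, accesses=5)),
    store.offer(Model("b", size=30, quality=0.5)),              # quality too low
    store.offer(Model("c", size=50, quality=0.9, accesses=9)),  # evicts "a"
    store.offer(Model("d", size=40, quality=0.7, accesses=1)),  # fits beside "c"
]
print(results, sorted(store.models))
```

The confirmation step for low-quality models on a data mining request is left out of the sketch.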
    • 4. Granted invention patent
    • Probabilistic data mining model comparison
    • US08990145B2
    • 2015-03-24
    • US13214105
    • 2011-08-19
    • Christoph Lingenfelder; Pascal Pompey; Michael Wurst
    • G06F17/30; G06F17/18; G06K9/62
    • G06F17/18; G06K9/62
    • A first data mining model and a second data mining model are compared. A first data mining model M1 represents results of a first data mining task on a first data set D1 and provides a set of first prediction values. A second data mining model M2 represents results of a second data mining task on a second data set D2 and provides a set of second prediction values. A relation R is determined between said sets of prediction values. For at least a first record of an input data set, a first and second probability distribution is created based on the first and second data mining models applied to the first record. A distance measure d is calculated for said first record using the first and second probability distributions and the relation. At least one region of interest is determined based on said distance measure d.
    • 5. Granted invention patent
    • Predictive modeling
    • US08738549B2
    • 2014-05-27
    • US13214097
    • 2011-08-19
    • Christoph Lingenfelder; Pascal Pompey; Michael Wurst
    • G06N5/00
    • G06N7/005; G06F17/18; G06K9/6256; G06K9/6277
    • A predictive analysis generates a predictive model (Padj(Y|X)) based on two separate pieces of information, a set of original training data (Dorig), and a “true” distribution of indicators (Ptrue(X)). The predictive analysis begins by generating a base model distribution (Pgen(Y|X)) from the original training data set (Dorig) containing tuples (x,y) of indicators (x) and corresponding labels (y). Using the “true” distribution (Ptrue(X)) of indicators, a random data set (D′) of indicator records (x) is generated reflecting this “true” distribution (Ptrue(X)). Subsequently, the base model (Pgen(Y|X)) is applied to said random data set (D′), thus assigning a label (y) or a distribution of labels to each indicator record (x) in said random data set (D′) and generating an adjusted training set (Dadj). Finally, an adjusted predictive model (Padj(Y|X)) is trained based on said adjusted training set (Dadj).
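The four steps in the abstract above (base model, resampling from the "true" indicator distribution, labeling, retraining) can be sketched as follows. This is a toy illustration under assumptions not found in the source: indicators are single categorical values, the learner is a frequency table, every sampled indicator value also occurs in the original training data, and each sampled record receives the most probable label. The names `fit_conditional` and `adjust` are hypothetical.

```python
import random
from collections import Counter, defaultdict

def fit_conditional(pairs):
    """Estimate P(y | x) from (x, y) tuples as relative frequencies."""
    counts = defaultdict(Counter)
    for x, y in pairs:
        counts[x][y] += 1
    return {x: {y: n / sum(c.values()) for y, n in c.items()}
            for x, c in counts.items()}

def adjust(d_orig, p_true, n_samples, seed=0):
    """Return (base model, adjusted training set, adjusted model)."""
    base = fit_conditional(d_orig)                      # P_gen(Y|X)
    rng = random.Random(seed)
    xs = rng.choices(list(p_true), weights=list(p_true.values()),
                     k=n_samples)                       # D' drawn from P_true(X)
    # Label each sampled indicator with the base model's most probable label.
    d_adj = [(x, max(base[x], key=base[x].get)) for x in xs]
    return base, d_adj, fit_conditional(d_adj)          # P_adj(Y|X)

# Hypothetical data: "young" is over-represented in the original training set.
d_orig = ([("young", "yes")] * 6 + [("young", "no")] * 2
          + [("old", "yes")] * 1 + [("old", "no")] * 3)
p_true = {"young": 0.2, "old": 0.8}

base, d_adj, adjusted = adjust(d_orig, p_true, n_samples=1000)
share_old = sum(1 for x, _ in d_adj if x == "old") / len(d_adj)
print(base, adjusted, share_old)
```

With a lookup-table learner the adjusted model carries no more conditional information than the base model; the resampling step would presumably matter more for learners whose fit depends on how the indicators are distributed.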
    • 7. Invention patent application
    • PREDICTIVE MODELING
    • US20120158624A1
    • 2012-06-21
    • US13214097
    • 2011-08-19
    • Christoph Lingenfelder; Pascal Pompey; Michael Wurst
    • G06N5/00
    • G06N7/005; G06F17/18; G06K9/6256; G06K9/6277
    • A predictive analysis generates a predictive model (Padj(Y|X)) based on two separate pieces of information, a set of original training data (Dorig), and a “true” distribution of indicators (Ptrue(X)). The predictive analysis begins by generating a base model distribution (Pgen(Y|X)) from the original training data set (Dorig) containing tuples (x,y) of indicators (x) and corresponding labels (y). Using the “true” distribution (Ptrue(X)) of indicators, a random data set (D′) of indicator records (x) is generated reflecting this “true” distribution (Ptrue(X)). Subsequently, the base model (Pgen(Y|X)) is applied to said random data set (D′), thus assigning a label (y) or a distribution of labels to each indicator record (x) in said random data set (D′) and generating an adjusted training set (Dadj). Finally, an adjusted predictive model (Padj(Y|X)) is trained based on said adjusted training set (Dadj).