Patent Quick Search: fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US5719692A Rule induction on large noisy data sets (Expired)
Publication number: US5719692A
Publication date: 1998-02-17
Application number: US499247
Filing date: 1995-07-07
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06F9/44, G06N5/02, G06N5/04, G06F17/00, G06F15/00
CPC classification: G06N5/025
Abstract: Efficient techniques for inducing rules used in classifying data items on a noisy data set. The prior-art IREP technique, which produces a set of classification rules by inducing each rule and then pruning it and continuing thus until a stopping condition is reached, is improved with a new rule-value metric for stopping pruning and with a stopping condition which depends on the description length of the rule set. The rule set which results from the improved IREP technique is then optimized by pruning rules from the set to minimize the description length and further optimized by making a replacement rule and a modified rule for each rule and using the description length to determine whether to use the replacement rule, the modified rule, or the original rule in the rule set. Further improvement is achieved by inducing rules for data items not covered by the original set and then pruning these rules. Still further improvement is gained by repeating the steps of inducing rules for data items not covered, pruning the rules, optimizing the rules, and again pruning for a fixed number of times. The fully-developed technique has the O(n log^2 n) running time characteristic of IREP, but produces rule sets which do a substantially better job of classification than those produced by IREP.
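
The pruning step named in this abstract can be illustrated with a small sketch. The abstract does not state the rule-value metric, so the code assumes the form (p - n) / (p + n) from Cohen's published description of RIPPER, where p and n are the positive and negative pruning-set examples a rule covers. The rule representation and the names `covers`, `rule_value`, and `prune_rule` are hypothetical, and this is not the patented procedure itself.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]   # (attribute values, is_positive)
Condition = Tuple[str, str]             # (attribute, required value)


def covers(rule: List[Condition], attrs: Dict[str, str]) -> bool:
    """A rule covers an example when every one of its conditions holds."""
    return all(attrs.get(attr) == value for attr, value in rule)


def rule_value(rule: List[Condition], prune_set: List[Example]) -> float:
    """Assumed metric (p - n) / (p + n), measured on the pruning set."""
    p = sum(1 for attrs, pos in prune_set if pos and covers(rule, attrs))
    n = sum(1 for attrs, pos in prune_set if not pos and covers(rule, attrs))
    return (p - n) / (p + n) if p + n else -1.0


def prune_rule(rule: List[Condition], prune_set: List[Example]) -> List[Condition]:
    """Drop trailing conditions as long as the metric does not get worse.

    At least one condition is always kept; ties favour the shorter rule.
    """
    best = list(rule)
    best_value = rule_value(best, prune_set)
    for k in range(len(rule) - 1, 0, -1):
        candidate = rule[:k]
        value = rule_value(candidate, prune_set)
        if value >= best_value:
            best, best_value = candidate, value
    return best


prune_set = [
    ({"outlook": "sunny", "wind": "weak"}, True),
    ({"outlook": "sunny", "wind": "strong"}, True),
    ({"outlook": "rain", "wind": "weak"}, False),
    ({"outlook": "rain", "wind": "strong"}, False),
]
grown = [("outlook", "sunny"), ("wind", "weak")]
pruned = prune_rule(grown, prune_set)
```

On this toy pruning set the second condition adds nothing, so the sketch keeps only the first condition. The description-length stopping condition and the later optimization passes mentioned in the abstract are not modelled here.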

2. Granted invention patent

US08538972B1 Context-dependent similarity measurements (Active)
Publication number: US08538972B1
Publication date: 2013-09-17
Application number: US13532972
Filing date: 2012-06-26
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/30675
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining similarity measures for objects in a dataset that include contextual associations of the objects with contexts. In one aspect, a method includes calculating a similarity measure for any two objects that include a common feature f based, in part, on the likelihood that the two object representations in the dataset that both include f will be associated with distinct contexts, and the likelihood that the two objects in the dataset that both include f will be associated with the same context.
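
The two likelihoods this abstract mentions can be estimated by counting pairs, as in the rough sketch below. For each feature f it counts the pairs of objects that both include f and splits them into same-context and distinct-context pairs. The abstract does not say how the two likelihoods are combined, so the log-ratio in `similarity`, including its direction, is purely an assumption, as are all names and the toy data.

```python
from collections import defaultdict
from itertools import combinations
from math import log
from typing import Dict, FrozenSet, List, Tuple

Record = Tuple[FrozenSet[str], str]   # (features of the object, context label)


def feature_likelihoods(records: List[Record]) -> Dict[str, Tuple[float, float]]:
    """For each feature f, estimate (P(same context), P(distinct contexts))
    over all pairs of records that both include f."""
    same: Dict[str, int] = defaultdict(int)
    distinct: Dict[str, int] = defaultdict(int)
    for (feats_a, ctx_a), (feats_b, ctx_b) in combinations(records, 2):
        for f in feats_a & feats_b:
            if ctx_a == ctx_b:
                same[f] += 1
            else:
                distinct[f] += 1
    result = {}
    for f in set(same) | set(distinct):
        total = same[f] + distinct[f]
        result[f] = (same[f] / total, distinct[f] / total)
    return result


def similarity(a: FrozenSet[str], b: FrozenSet[str],
               likelihoods: Dict[str, Tuple[float, float]],
               eps: float = 0.01) -> float:
    """Sum an assumed per-feature weight log(P(distinct) / P(same)) over shared features.

    eps avoids log(0); features never seen in a shared pair get a neutral weight.
    """
    score = 0.0
    for f in a & b:
        p_same, p_distinct = likelihoods.get(f, (0.5, 0.5))
        score += log((p_distinct + eps) / (p_same + eps))
    return score


records = [
    (frozenset({"smith", "j"}), "doc1"),
    (frozenset({"smith", "k"}), "doc1"),
    (frozenset({"jones", "j"}), "doc2"),
    (frozenset({"jones", "k"}), "doc2"),
]
likelihoods = feature_likelihoods(records)
```

With this toy data, "smith" is only ever shared within one context and "j" only across contexts, so the assumed weighting scores a shared "j" higher than a shared "smith". A different combination rule would rank them differently.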

3. Granted invention patent

US08234285B1 Context-dependent similarity measurements (Active)
Publication number: US08234285B1
Publication date: 2012-07-31
Application number: US12506685
Filing date: 2009-07-21
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/30675
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining similarity measures for objects in a dataset that include contextual associations of the objects with contexts. In one aspect, a method includes calculating a similarity measure for any two objects that include a common feature f based, in part, on the likelihood that the two object representations in the dataset that both include f will be associated with distinct contexts, and the likelihood that the two objects in the dataset that both include f will be associated with the same context.

4. Granted invention patent

US06516308B1 Method and apparatus for extracting data from data sources on a network (Expired)
Publication number: US06516308B1
Publication date: 2003-02-04
Application number: US09568145
Filing date: 2000-05-10
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06F15/18
CPC classification: G06F17/30569, G06F17/30902, Y10S707/99935
Abstract: A method and apparatus are provided for producing a general data extraction procedure capable of extracting data from data sources on a network regardless of data format. The general data extraction procedure is determined from a plurality of pairs of data from the network, each pair including a data source and a program which accurately extracts data from the data source. The pairs of data are processed by a learning system to learn a general program for extracting data from new data sources.

5. Granted invention patent

US5481650A Biased learning system (Expired)
Publication number: US5481650A
Publication date: 1996-01-02
Application number: US320102
Filing date: 1994-10-07
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06N5/02, G06F15/18
CPC classification: G06N5/025
Abstract: The invention permits various types of background knowledge for a concept learning system to be represented in a single formal structure known as an antecedent description grammar. A user formulates background knowledge for a learning problem into such a grammar, which then becomes an input to a learning system, together with training data representing the concept to be learned. The learning system, constrained by the grammar, then uses the training data to generate a hypothesis for the concept to be learned. Such a hypothesis is in the form of a set of logic clauses known as Horn clauses.

6. Granted invention patent

US06295533B2 System and method for accessing heterogeneous databases (Expired)
Publication number: US06295533B2
Publication date: 2001-09-25
Application number: US09028471
Filing date: 1998-02-24
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06F17/30
CPC classification: G06F17/30566, Y10S707/99935
Abstract: A system and method are provided for answering queries concerning information stored in a set of collections. Each collection includes a structured entity, and each structured entity includes a field. A query is received that specifies a subset of the set of collections and a logical constraint between fields that includes a requirement that a first field match a second field. The probability that the first field matches the second field is determined automatically based upon the contents of the fields. A collection of lists is generated in response to the query, where each list includes members of the subset of collections specified in the query, and where each list has an estimate of the probability that the members of the list satisfy the logical constraint specified in the query.
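
The kind of ranked, approximate matching this abstract describes can be sketched as a "soft join" between two collections of text fields. The abstract does not say how the match probability is computed, so the Jaccard token overlap below is only a stand-in score, not a calibrated probability. The function names and company names are hypothetical.

```python
from typing import List, Set, Tuple


def tokens(text: str) -> Set[str]:
    """Lower-cased whitespace tokens of a field value."""
    return set(text.lower().split())


def match_score(a: str, b: str) -> float:
    """Jaccard overlap of tokens, used as a stand-in for the match probability."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def soft_join(left: List[str], right: List[str], k: int = 3) -> List[Tuple[float, str, str]]:
    """Return the k best-scoring (score, left_field, right_field) pairs, best first."""
    scored = [(match_score(a, b), a, b) for a in left for b in right]
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


left = ["Acme Software Corp", "Globex Inc"]
right = ["Acme Software", "Globex Incorporated", "Initech"]
ranked = soft_join(left, right)
```

Each returned tuple carries its score, mirroring the abstract's idea that every answer comes with an estimate of how well it satisfies the match constraint. Scoring every pair is quadratic in the collection sizes, which is acceptable for a sketch but not for large collections.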

7. Granted invention patent

US5627945A Biased learning system (Expired)
Publication number: US5627945A
Publication date: 1997-05-06
Application number: US566198
Filing date: 1995-12-01
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06N5/02, G06F3/00
CPC classification: G06N5/025
Abstract: The invention permits various types of background knowledge for a concept learning system to be represented in a single formal structure known as an antecedent description grammar. A user formulates background knowledge for a learning problem into such a grammar, which then becomes an input to a learning system, together with training data representing the concept to be learned. The learning system, constrained by the grammar, then uses the training data to generate a hypothesis for the concept to be learned. Such a hypothesis is in the form of a set of logic clauses known as Horn clauses.

8. Granted invention patent

US06418432B1 System and method for finding information in a distributed information system using query learning and meta search (Expired)
Publication number: US06418432B1
Publication date: 2002-07-09
Application number: US09117312
Filing date: 1998-07-24
Applicant: William W. Cohen, Yoram Singer
Inventor: William W. Cohen, Yoram Singer
IPC classification: G06F17/30
CPC classification: G06F17/30864, G06F17/30707, Y10S707/99935
Abstract: An information retrieval system finds information in a Distributed Information System (DIS), e.g., the Internet, using query learning and meta search for adding documents to resource directories contained in the DIS. A selection means generates training data characterized as positive and negative examples of a particular class of data residing in the DIS. A learning means generates from the training data at least one query that can be submitted to any one of a plurality of search engines for searching the DIS to find “new” items of the particular class. An evaluation means determines and verifies that the new item(s) is a new subset of the particular class and adds or updates the particular class in the resource directory.
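
The query-learning step in this abstract, turning positive and negative examples into a query a search engine could accept, can be sketched as follows. The abstract does not describe the learning method, so the greedy conjunctive-keyword learner here is an assumed stand-in. `learn_query` and the example documents are hypothetical, and no real search-engine API is called.

```python
from typing import List, Set


def learn_query(positives: List[Set[str]], negatives: List[Set[str]]) -> List[str]:
    """Greedily build a conjunctive keyword query.

    Every chosen term occurs in all positive examples; terms are added until
    no negative example contains all of them, or no useful term remains.
    """
    candidates = set.intersection(*positives) if positives else set()
    query: List[str] = []
    remaining = list(negatives)   # negatives still matched by the query so far
    while remaining and candidates:
        # Pick the term that rules out the most remaining negatives
        # (sorted() makes tie-breaking deterministic).
        best = max(sorted(candidates),
                   key=lambda term: sum(term not in doc for doc in remaining))
        if all(best in doc for doc in remaining):
            break   # no candidate term excludes any further negative
        query.append(best)
        candidates.discard(best)
        remaining = [doc for doc in remaining if best in doc]
    return query


positives = [
    {"machine", "learning", "conference", "2024"},
    {"machine", "learning", "conference", "call"},
]
negatives = [
    {"machine", "shop", "conference"},
    {"learning", "disability", "conference"},
]
query = learn_query(positives, negatives)
```

The resulting term list could be joined into an AND query and submitted to several search engines, which is the meta-search part the abstract refers to. Verifying that the returned items are genuinely new members of the class is a separate step that this sketch does not cover.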

9. Granted invention patent

US5642472A Software discovery system (Expired)
Publication number: US5642472A
Publication date: 1997-06-24
Application number: US246437
Filing date: 1994-05-20
Applicant: William W. Cohen
Inventor: William W. Cohen
IPC classification: G06F9/06, G06F9/44, G06F11/34, G06F11/36, G06F15/18, G06N5/04, G06F3/00
CPC classification: G06F11/3604, G06F11/34, G06F11/3624, G06F11/3636, G06N99/005, G06F11/3466
Abstract: Apparatus and methods which employ a machine learning system to "learn" the specification for a program from a trace of an execution of the program on a set of test problems. The program is instrumented to produce the trace. Performance is improved by means of a declarative bias which expresses knowledge of the user about the program and constrains the learning system to produce only specifications which are consistent with the declarative bias. The apparatus and methods of the preferred embodiment are employed to learn specifications of views in a database for a telephone switching system from traces produced by executing the programs which produce the views. Techniques for producing more than one specification and for dealing with views which involve conversions are also disclosed.
