Patent Quick Search: fast search of global patents, free commercial patent database - IPRDB

21. Invention application

US20050257147A1 Spell checker with arbitrary length string-to-string transformations to improve noisy channel spelling correction
Legal status: Lapsed
Publication No.: US20050257147A1
Publication date: 2005-11-17
Application No.: US11182214
Filing date: 2005-07-15
Applicant(s): Eric Brill, Robert Moore
Inventor(s): Eric Brill, Robert Moore
IPC classification: G06F17/24, G06F17/27, G10L15/00, G10L15/18
CPC classification: G06F17/273, G10L15/183
Abstract: A spell checker based on the noisy channel model has a source model and an error model. The source model determines how likely a word w in a dictionary is to have been generated. The error model determines how likely the word w was to have been incorrectly entered as the string s (e.g., mistyped or incorrectly interpreted by a speech recognition system) according to the probabilities of string-to-string edits. The string-to-string edits allow conversion of one arbitrary length character sequence to another arbitrary length character sequence.
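
The abstract above describes choosing the dictionary word w that maximizes P(w) times P(s | w), where P(s | w) comes from substring-to-substring edits. The following is a minimal sketch of that idea, not the patented implementation. The toy dictionary `SOURCE`, the edit table `EDITS`, the constants `MATCH`, `DEFAULT` and `MAX_LEN`, and the choice to score only the single best partition of the two strings are all assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical toy parameters, not taken from the patent.
SOURCE = {"physical": 0.6, "fiscal": 0.4}          # source model P(w)
EDITS = {("ph", "f"): 0.05, ("c", "k"): 0.03}      # P(alpha -> beta)
MATCH = 0.9      # probability of copying one character unchanged
DEFAULT = 1e-4   # fallback for a single-character insert/delete/substitute
MAX_LEN = 2      # longest substring allowed on either side of an edit


def edit_prob(alpha, beta):
    """Probability that substring alpha of the word is typed as beta."""
    if (alpha, beta) in EDITS:
        return EDITS[(alpha, beta)]
    if alpha == beta and len(alpha) == 1:
        return MATCH
    if len(alpha) <= 1 and len(beta) <= 1:
        return DEFAULT
    return 0.0


def error_prob(w, s):
    """Best-partition estimate of P(s | w), computed by dynamic programming."""

    @lru_cache(maxsize=None)
    def best(i, j):
        # Highest probability of turning w[:i] into s[:j].
        if i == 0 and j == 0:
            return 1.0
        score = 0.0
        for a in range(min(MAX_LEN, i) + 1):
            for b in range(min(MAX_LEN, j) + 1):
                if a == 0 and b == 0:
                    continue
                p = edit_prob(w[i - a:i], s[j - b:j])
                if p > 0.0:
                    score = max(score, best(i - a, j - b) * p)
        return score

    return best(len(w), len(s))


def correct(s):
    """Return the dictionary word maximizing P(w) * P(s | w)."""
    return max(SOURCE, key=lambda w: SOURCE[w] * error_prob(w, s))
```

With these toy numbers, `correct("fysical")` selects "physical", because the multi-character edit "ph" to "f" is far more probable than the chain of single-character edits needed to reach the input from "fiscal".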

22. Invention application

US20050203878A1 User intent discovery
Legal status: In force
Publication No.: US20050203878A1
Publication date: 2005-09-15
Application No.: US10796378
Filing date: 2004-03-09
Applicant(s): Eric Brill, Harold Daume
Inventor(s): Eric Brill, Harold Daume
IPC classification: G06F7/00, G06F15/02, G06F17/30, H04Q7/32
CPC classification: G06F17/3064, G06F17/30672, G06F17/30867, Y10S707/99931, Y10S707/99933, Y10S707/99934
Abstract: A system that facilitates determining a user's intent given a user search query comprises a search engine that is employed to search over a collection of objects within a data store to retrieve a user search result set. The objects within the result set are associated with queries that were previously utilized to locate such objects. A level of relatedness between the previous queries and the user search query is determined, and previous queries that are associated with a result set that is novel and related to the user search result set are returned to the user.
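
One way to read "novel and related" in the abstract above is as a band of result-set overlap: a previous query is suggested when its results overlap the user's results enough to be related, but not so much that they add nothing new. The sketch below illustrates that reading only. The function name `suggest_queries`, the Jaccard overlap measure, and both thresholds are assumptions and do not come from the patent.

```python
def suggest_queries(user_results, query_log, min_overlap=0.2, max_overlap=0.8):
    """Return previous queries whose result sets are related but not identical.

    user_results: iterable of object ids returned for the user's query.
    query_log: dict mapping a previous query to the object ids it located.
    """
    user = set(user_results)
    suggestions = []
    for query, results in query_log.items():
        previous = set(results)
        union = user | previous
        if not union:
            continue
        # Jaccard overlap: 0.0 means unrelated, 1.0 means nothing novel.
        overlap = len(user & previous) / len(union)
        if min_overlap <= overlap <= max_overlap:
            suggestions.append((overlap, query))
    # Most closely related suggestions first.
    return [query for _, query in sorted(suggestions, reverse=True)]


log = {
    "jaguar car": ["d1", "d2", "d4"],   # related and partly novel
    "jaguar": ["d1", "d2", "d3"],       # identical results, nothing novel
    "python": ["d9"],                   # unrelated
}
print(suggest_queries(["d1", "d2", "d3"], log))
```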

23. Invention application

US20070239702A1 Using connectivity distance for relevance feedback in search
Legal status: In force
Publication No.: US20070239702A1
Publication date: 2007-10-11
Application No.: US11393480
Filing date: 2006-03-30
Applicant(s): Serguei Vassilvitskii, Eric Brill
Inventor(s): Serguei Vassilvitskii, Eric Brill
IPC classification: G06F17/30
CPC classification: G06F17/30864, Y10S707/99933, Y10S707/99934, Y10S707/99935, Y10S707/99936
Abstract: A unique system and method is provided that facilitates improving relevance of search results over the initial search ranking. The system and method involve obtaining relevancy feedback for at least one search result (user rated) and then generating a connectivity graph or web-graph (for Web searches) for the user rated result. The relative distance between results (or pages) in the graph can indicate relevancy between those results. Thus, results within a particular distance from the rated result can be considered related to the rated result and thus relevant or irrelevant, depending on the particular rating for that result. The connectivity graph can be employed to determine a re-ranking of the search results.
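
The re-ranking idea in the abstract above can be sketched with a breadth-first search over a link graph: results close to a rated result inherit some of its rating. This is an illustrative sketch under stated assumptions, not the patented method. The adjacency-dict graph format, the function names, and the scoring formula (original rank position plus a boost that decays with distance) are all hypothetical.

```python
from collections import deque


def distances_from(graph, start, max_dist):
    """Breadth-first distances from start, limited to max_dist hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_dist:
            continue
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def rerank(ranking, graph, rated, rating, max_dist=2):
    """Re-rank results given one user rating.

    rating is +1 (rated relevant) or -1 (rated irrelevant).
    """
    dist = distances_from(graph, rated, max_dist)
    n = len(ranking)

    def score(item):
        position, page = item
        base = n - position  # earlier original position scores higher
        boost = rating * n / (1 + dist[page]) if page in dist else 0.0
        return base + boost

    ordered = sorted(enumerate(ranking), key=score, reverse=True)
    return [page for _, page in ordered]


links = {"d": ["e"], "e": ["d"]}
print(rerank(["a", "b", "c", "d", "e"], links, rated="e", rating=1, max_dist=1))
```

In this example the user marks "e" as relevant, so "e" moves to the top and its neighbor "d" moves ahead of "b" and "c".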

24. Invention application

US20050273317A1 Method and apparatus for unsupervised training of natural language processing units
Legal status: In force
Publication No.: US20050273317A1
Publication date: 2005-12-08
Application No.: US11204213
Filing date: 2005-08-15
Applicant(s): Eric Brill, Arul Menezes
Inventor(s): Eric Brill, Arul Menezes
IPC classification: G06F17/27
CPC classification: G06F17/274
Abstract: A method of training a natural language processing unit applies a candidate learning set to at least one component of the natural language unit. The natural language unit is then used to generate a meaning set from a first corpus. A second meaning set is generated from a second corpus using a second natural language unit and the two meaning sets are compared to each other to form a score for the candidate learning set. This score is used to determine whether to modify the natural language unit based on the candidate learning set.

25. Invention application

US20050251744A1 Spell checker with arbitrary length string-to-string transformations to improve noisy channel spelling correction
Legal status: Lapsed
Publication No.: US20050251744A1
Publication date: 2005-11-10
Application No.: US11182388
Filing date: 2005-07-15
Applicant(s): Eric Brill, Robert Moore
Inventor(s): Eric Brill, Robert Moore
IPC classification: G06F17/24, G06F17/27, G10L15/00, G10L15/18
CPC classification: G06F17/273, G10L15/183
Abstract: A spell checker based on the noisy channel model has a source model and an error model. The source model determines how likely a word w in a dictionary is to have been generated. The error model determines how likely the word w was to have been incorrectly entered as the string s (e.g., mistyped or incorrectly interpreted by a speech recognition system) according to the probabilities of string-to-string edits. The string-to-string edits allow conversion of one arbitrary length character sequence to another arbitrary length character sequence.

26. Invention application

US20050234904A1 Systems and methods that rank search results
Legal status: In force
Publication No.: US20050234904A1
Publication date: 2005-10-20
Application No.: US10820947
Filing date: 2004-04-08
Applicant(s): Eric Brill, Jesper Lind, Marc Smith, Wensi Xi, Duncan Davenport
Inventor(s): Eric Brill, Jesper Lind, Marc Smith, Wensi Xi, Duncan Davenport
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/30675
Abstract: The present invention provides systems and methods that rank search results. Such ranking typically includes determining a relevance of individual search results via one or more feature-based relevance functions. These functions can be tailored to users and/or applications, and typically are based on scoped information (e.g., lexical), digital artifact author related attributes, digital artifact source repository attributes, and/or relationships between features, for example. In addition, relevance functions can be generated via training sets (e.g., machine learning) or initial guesses that are iteratively refined over time. Upon determining relevance, search results can be ordered with respect to one another, based on respective relevances. Additionally, thresholding can be utilized to mitigate returning results likely to be non-relevant to the query, user and/or application.
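
A feature-based relevance function with thresholding, as outlined in the abstract above, might look like the following sketch. The linear weighted sum, the feature names `lexical`, `author` and `source`, the weights, and the threshold value are assumptions chosen for illustration. The abstract does not specify the form of the relevance function.

```python
def rank(results, weights, threshold=0.5):
    """Score each result with a weighted feature sum, drop low scores, sort.

    results: dict mapping a result id to a dict of feature values in [0, 1].
    weights: dict mapping a feature name to its (hypothetical) weight.
    """
    scored = []
    for doc_id, features in results.items():
        relevance = sum(weights.get(name, 0.0) * value
                        for name, value in features.items())
        # Thresholding: results unlikely to be relevant are not returned.
        if relevance >= threshold:
            scored.append((relevance, doc_id))
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored]


weights = {"lexical": 0.6, "author": 0.3, "source": 0.1}
results = {
    "d1": {"lexical": 0.9, "author": 0.5, "source": 1.0},
    "d2": {"lexical": 0.2, "author": 0.1, "source": 0.0},
    "d3": {"lexical": 1.0, "author": 1.0, "source": 0.0},
}
print(rank(results, weights))
```

Here "d2" falls below the threshold and is omitted, while "d3" outranks "d1".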

27. Invention application

US20050165753A1 Building and using subwebs for focused search
Legal status: Lapsed
Publication No.: US20050165753A1
Publication date: 2005-07-28
Application No.: US10778498
Filing date: 2004-02-13
Applicant(s): Harr Chen, Raman Chandrasekar, Simon Corston, Eric Brill
Inventor(s): Harr Chen, Raman Chandrasekar, Simon Corston, Eric Brill
IPC classification: G06N99/00, G06F17/00, G06F17/30, G06F7/00
CPC classification: G06F17/30867, Y10S707/99933
Abstract: A system that facilitates performance of a focused search over a collection of sites comprises a subweb that corresponds to a topic and/or user characteristic(s) that are of interest to the user. The subweb includes a plurality of domains and/or paths (e.g., sites) that are related to the topic and/or the user characteristic(s). Each of the sites within the subweb is assigned a weight that indicates relevance of the site to the desirable topic and/or user characteristic(s). A search engine employs the subweb to facilitate focusing a search over a collection of sites. The search engine receives a query, and utilizes the subweb to focus a search over the selection of sites corresponding to the topic and/or user characteristic(s) represented by the subweb. The results from the search are returned to the user based at least in part upon the relevance weights assigned to the sites within the subweb.
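
The subweb described in the abstract above can be pictured as a mapping from sites to relevance weights that filters and re-scores ordinary search results. The sketch below assumes a subweb keyed by host name only (the abstract also mentions paths), and assumes the final score is the engine's score multiplied by the site weight. Both choices, and the function name `focused_search`, are illustrative.

```python
from urllib.parse import urlparse


def focused_search(results, subweb):
    """Keep results from subweb sites and order them by weighted score.

    results: list of (url, engine_score) pairs from a general search.
    subweb: dict mapping a host name to its relevance weight for the topic.
    """
    ranked = []
    for url, score in results:
        host = urlparse(url).netloc
        weight = subweb.get(host)
        if weight is not None:  # sites outside the subweb are dropped
            ranked.append((score * weight, url))
    ranked.sort(reverse=True)
    return [url for _, url in ranked]


subweb = {"example.org": 0.9, "example.net": 0.4}
hits = [
    ("https://example.com/a", 0.95),
    ("https://example.net/b", 0.80),
    ("https://example.org/c", 0.50),
]
print(focused_search(hits, subweb))
```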

28. Invention application

US20050033711A1 Cost-benefit approach to automatically composing answers to questions by extracting information from large unstructured corpora
Legal status: In force
Publication No.: US20050033711A1
Publication date: 2005-02-10
Application No.: US10635274
Filing date: 2003-08-06
Applicant(s): Eric Horvitz, David Azari, Susan Dumais, Eric Brill
Inventor(s): Eric Horvitz, David Azari, Susan Dumais, Eric Brill
IPC classification: G06F17/30, G06F17/00, G06F7/00, G06N5/02
CPC classification: G06F17/30684, G06F17/30687, Y10S707/99933
Abstract: The present invention relates to a system and methodology to facilitate extraction of information from large unstructured corpora such as the World Wide Web and/or other unstructured sources. Information in the form of answers to questions can be automatically composed from such sources via probabilistic models and cost-benefit analyses to guide resource-intensive information-extraction procedures employed by a knowledge-based question answering system. The analyses can leverage predictions of the ultimate quality of answers generated by the system provided by Bayesian or other statistical models. Such predictions, when coupled with a utility model, can provide the system with the ability to make decisions about the number of queries issued to a search engine (or engines), given the cost of queries and the expected value of query results in refining an ultimate answer. Given a preference model, information extraction actions can be taken with the highest expected utility. In this manner, the accuracy of answers to questions can be balanced with the cost of information extraction and analysis to compose the answers.
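
The decision described in the abstract above, how many queries to issue given their cost and their expected benefit, can be illustrated with a simple expected-utility calculation. The sketch assumes a linear utility (value of a correct answer times predicted quality, minus a fixed cost per query) and takes the predicted quality curve as given. The function name and all numbers are hypothetical, and the patent's statistical models are not reproduced here.

```python
def queries_to_issue(predicted_quality, value_of_answer, cost_per_query):
    """Pick the number of queries with the highest expected utility.

    predicted_quality[n - 1] is the predicted probability that the answer
    is correct after issuing n queries (assumed to come from some model).
    Returns 0 if no number of queries has positive expected utility.
    """
    best_n, best_utility = 0, 0.0
    for n, quality in enumerate(predicted_quality, start=1):
        utility = value_of_answer * quality - cost_per_query * n
        if utility > best_utility:
            best_n, best_utility = n, utility
    return best_n


quality_curve = [0.40, 0.60, 0.70, 0.74, 0.75]  # diminishing returns
print(queries_to_issue(quality_curve, value_of_answer=10.0, cost_per_query=0.5))
```

With this curve the expected utility peaks at three queries. The fourth query improves predicted quality by only 0.04, which is worth less than its cost.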

29. Invention application

US20070016616A1 Automated error checking system and method
Legal status: Lapsed
Publication No.: US20070016616A1
Publication date: 2007-01-18
Application No.: US11533483
Filing date: 2006-09-20
Applicant(s): Eric Brill, Robert Rounthwaite
Inventor(s): Eric Brill, Robert Rounthwaite
IPC classification: G06F17/00, G06F17/30, G06F17/21
CPC classification: G06F17/273, Y10S707/99935, Y10S707/99936, Y10S707/99942
Abstract: The present invention relates to a system and methodology to facilitate automated error correction of user input data via an analysis of the input data in accordance with an automatically generated and filtered database of processed structural groupings or formulations selected and filtered from past user activities. The filtered database provides a relevant foundation of potential phrases, topics, symbols, speech and/or colloquial structures of interest to users, which are automatically determined from previous user activity and employed to facilitate automated error checking in accordance with the user's current input, command and/or request for information.
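
As a rough illustration of the abstract above, a database can be built from past user queries, filtered by frequency, and then used to propose a correction for new input. The sketch uses Python's standard `difflib.get_close_matches` as a stand-in similarity measure. The frequency filter, the cutoff value, and the function names are assumptions, and the patent's own filtering and matching steps are not specified in the abstract.

```python
import difflib
from collections import Counter


def build_database(query_log, min_count=2):
    """Keep only past queries seen at least min_count times."""
    counts = Counter(query.strip().lower() for query in query_log)
    return {query: count for query, count in counts.items() if count >= min_count}


def suggest(user_input, database, cutoff=0.8):
    """Return the most frequent similar past query, or None if none is close."""
    text = user_input.strip().lower()
    if text in database:
        return text  # already a known formulation, nothing to correct
    candidates = difflib.get_close_matches(text, list(database), n=5, cutoff=cutoff)
    if not candidates:
        return None
    return max(candidates, key=lambda query: database[query])


log = ["britney spears"] * 3 + ["britny spears"] + ["weather"] * 2
database = build_database(log)
print(suggest("britny spears", database))
```

The rare misspelling is filtered out of the database because it appears only once, so it is corrected to the frequent form instead of being treated as valid.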

30. Invention application

US20060294037A1 Cost-benefit approach to automatically composing answers to questions by extracting information from large unstructured corpora
Legal status: In force
Publication No.: US20060294037A1
Publication date: 2006-12-28
Application No.: US11469136
Filing date: 2006-08-31
Applicant(s): Eric Horvitz, David Azari, Susan Dumais, Eric Brill
Inventor(s): Eric Horvitz, David Azari, Susan Dumais, Eric Brill
IPC classification: G06N5/02, G06F17/00
CPC classification: G06F17/30684, G06F17/30687, Y10S707/99933
Abstract: The present invention relates to a system and methodology to facilitate extraction of information from large unstructured corpora such as the World Wide Web and/or other unstructured sources. Information in the form of answers to questions can be automatically composed from such sources via probabilistic models and cost-benefit analyses to guide resource-intensive information-extraction procedures employed by a knowledge-based question answering system. The analyses can leverage predictions of the ultimate quality of answers generated by the system provided by Bayesian or other statistical models. Such predictions, when coupled with a utility model, can provide the system with the ability to make decisions about the number of queries issued to a search engine (or engines), given the cost of queries and the expected value of query results in refining an ultimate answer. Given a preference model, information extraction actions can be taken with the highest expected utility. In this manner, the accuracy of answers to questions can be balanced with the cost of information extraction and analysis to compose the answers.
