    • 22. Granted invention patent
    • Apparatus and methods for an information retrieval system that employs natural language processing of search results to improve overall precision
    • US5933822A
    • 1999-08-03
    • US898652
    • 1997-07-22
    • Lisa Braden-Harder; Simon H. Corston; William B. Dolan; Lucy H. Vanderwende
    • G06F17/28; G06F17/30; G06F17/00
    • G06F17/30684; Y10S707/99932; Y10S707/99933; Y10S707/99934; Y10S707/99935; Y10S707/99936
    • Apparatus and accompanying methods for an information retrieval system that utilizes natural language processing to process results retrieved by, for example, an information retrieval engine such as a conventional statistical-based search engine, in order to improve overall precision. Specifically, such a search ultimately yields a set of retrieved documents. Each such document is then subjected to natural language processing to produce a set of logical forms. Each such logical form encodes, in a word-relation-word manner, semantic relationships, particularly argument and adjunct structure, between words in a phrase. A user-supplied query is analyzed in the same manner to yield a set of corresponding logical forms therefor. Documents are ranked as a predefined function of the logical forms from the documents and the query. Specifically, the set of logical forms for the query is then compared against a set of logical forms for each of the retrieved documents in order to ascertain a match between any such logical forms in both sets. Each document that has at least one matching logical form is heuristically scored, with each different relation for a matching logical form being assigned a different corresponding predefined weight. The score of each such document is, e.g., a predefined function of the weights of its uniquely matching logical forms. Finally, the retained documents are ranked in order of descending score and then presented to a user in that order.
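The re-ranking step described in the abstract of entry 22 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes logical forms have already been extracted as word-relation-word triples, and the relation names (`subject`, `object`, `modifier`), the weight values, and the use of a plain sum as the scoring function are all hypothetical choices made for the example.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class LogicalForm(NamedTuple):
    """A word-relation-word triple (hypothetical representation)."""
    head: str
    relation: str
    dependent: str


# Hypothetical weights: the abstract only says each relation has a
# predefined weight; these names and numbers are invented.
RELATION_WEIGHTS = {"subject": 100, "object": 75, "modifier": 25}
DEFAULT_WEIGHT = 10  # assumed fallback for relations not listed above


def rank_documents(
    query_forms: Iterable[LogicalForm],
    docs: Dict[str, Iterable[LogicalForm]],
) -> List[Tuple[str, int]]:
    """Return (doc_id, score) pairs in descending score order.

    Documents with no logical form in common with the query are dropped.
    """
    query_set = set(query_forms)
    ranked = []
    for doc_id, forms in docs.items():
        # Using sets means each matching form counts once ("uniquely matching").
        matches = query_set & set(forms)
        if not matches:
            continue
        # Assumed scoring function: sum of the weights of the matched relations.
        score = sum(
            RELATION_WEIGHTS.get(lf.relation, DEFAULT_WEIGHT) for lf in matches
        )
        ranked.append((doc_id, score))
    # Highest score first; ties broken by document id for determinism.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

Because matching is done on whole triples, a document that mentions the same words in a different relation (for example, the query's subject appearing as an object) does not count as a match, which is the property the abstract relies on to improve precision over purely statistical retrieval.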
    • 23. Invention patent application
    • Stimulus Description Collections
    • US20120109623A1
    • 2012-05-03
    • US12916951
    • 2010-11-01
    • William B. Dolan; David L. Chen
    • G06F17/28
    • G06F17/2827
    • The subject disclosure generally describes a technology by which text and/or speech descriptions are collected by showing a stimulus such as video clips to contributors (e.g., of a crowd-sourcing service). The descriptions, which are in the language of each contributor's choice, are of the same stimulus and thus associated with one another. While each contributor may be monolingual, the technique allows for the collection of approximately bilingual data, since more than one language may be represented among the different contributors. The descriptions may be used as translation data for training a machine translation engine, and as paraphrase data (grouped by the same language) for training a machine paraphrasing system. Also described is evaluating the quality of a machine paraphrasing system via a distinctiveness metric.
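As a rough illustration of how the descriptions collected in entry 23 could be turned into training data, the sketch below groups descriptions by stimulus and pairs them up. The record layout `(stimulus_id, language, text)` and the function name `build_pairs` are assumptions made for this example; the abstract does not specify a data format.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, List, Tuple

Description = Tuple[str, str, str]  # (stimulus_id, language, text), assumed layout


def build_pairs(descriptions: Iterable[Description]):
    """Pair up descriptions of the same stimulus.

    Returns (translations, paraphrases):
      translations: [((lang1, text1), (lang2, text2)), ...] across languages
      paraphrases:  [(lang, text1, text2), ...] within one language
    """
    by_stimulus = defaultdict(list)
    for stimulus_id, language, text in descriptions:
        by_stimulus[stimulus_id].append((language, text))

    translations: List[tuple] = []
    paraphrases: List[tuple] = []
    for items in by_stimulus.values():
        for (lang1, text1), (lang2, text2) in combinations(items, 2):
            if lang1 != lang2:
                # Different languages, same stimulus: approximate translation pair.
                translations.append(((lang1, text1), (lang2, text2)))
            elif text1 != text2:
                # Same language, different wording: paraphrase pair.
                paraphrases.append((lang1, text1, text2))
    return translations, paraphrases
```

The pairs are only approximately parallel, since contributors describe the same clip independently rather than translating one another; any downstream training would need to tolerate that noise.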
    • 30. Granted invention patent
    • Bootstrapping sense characterizations of occurrences of polysemous words
    • US06078878A
    • 2000-06-20
    • US904422
    • 1997-07-31
    • William B. Dolan
    • G06F17/27
    • G06F17/2795
    • The present invention is directed to characterizing the sense of an occurrence of a polysemous word in a representation of a dictionary. In a preferred embodiment, the representation of the dictionary is made up of a plurality of text segments containing word occurrences having a word sense characterization and word occurrences not having a word sense characterization. The embodiment first selects a plurality of the dictionary text segments that each contain a first word. The embodiment then identifies from among the selected text segments a first and a second occurrence of a second word. The identified second occurrence of the second word has a word sense characterization. The embodiment then attributes to the first occurrence of the second word the word sense characterization of the second occurrence of the second word.
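The bootstrapping step in the abstract of entry 30 might look something like the sketch below. It is an illustration under stated assumptions: a text segment is modeled as a list of `Occurrence` objects, a sense characterization is an opaque string label, and when several characterized occurrences of the same word exist the first one seen is used. None of these details come from the patent text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Occurrence:
    """One word occurrence in a dictionary text segment (assumed model)."""
    word: str
    sense: Optional[str] = None  # None means no sense characterization yet


def bootstrap_senses(segments: List[List[Occurrence]], first_word: str) -> int:
    """Propagate known senses among segments that contain `first_word`.

    Returns the number of occurrences that received a sense.
    """
    # Step 1: select the segments that each contain the first word.
    selected = [
        seg for seg in segments if any(occ.word == first_word for occ in seg)
    ]

    # Step 2: collect characterized occurrences of other ("second") words.
    # Simplification: keep the first sense seen for each word.
    known = {}
    for seg in selected:
        for occ in seg:
            if occ.sense is not None:
                known.setdefault(occ.word, occ.sense)

    # Step 3: attribute that sense to uncharacterized occurrences of the
    # same word within the selected segments.
    updated = 0
    for seg in selected:
        for occ in seg:
            if occ.sense is None and occ.word in known:
                occ.sense = known[occ.word]
                updated += 1
    return updated
```

The shared first word acts as context: two occurrences of a polysemous word that both appear near the same first word are assumed to carry the same sense, which is why segments lacking the first word are left untouched.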