Quick Patent Search: search global patents quickly, free patent database for commercial use - IPRDB

1. Granted patent

US07620539B2 Methods and apparatuses for identifying bilingual lexicons in comparable corpora using geometric processing (Status: in force)
Publication No.: US07620539B2
Publication date: 2009-11-17
Application No.: US10976847
Filing date: 2004-11-01
Applicants: Eric Gaussier, Jean-Michel Renders, Herve Dejean, Cyril Goutte, Irina Matveeva
Inventors: Eric Gaussier, Jean-Michel Renders, Herve Dejean, Cyril Goutte, Irina Matveeva
IPC classification: G06F17/28, G06F17/20, G06F17/27, G06F17/21, G06F17/30
CPC classification: G06F17/2785, G06F17/2735, G06F17/2827, Y10S707/99937
Abstract: Various methods formulated using a geometric interpretation for identifying bilingual pairs in comparable corpora using a bilingual dictionary are disclosed. The methods may be used separately or in combination to compute the similarity between bilingual pairs.

2. Patent application

US20070005340A1 Incremental training for probabilistic categorizer (Status: in force)
Publication No.: US20070005340A1
Publication date: 2007-01-04
Application No.: US11170019
Filing date: 2005-06-29
Applicants: Cyril Goutte, Eric Gaussier
Inventors: Cyril Goutte, Eric Gaussier
IPC classification: G06F17/27
CPC classification: G06F17/277, G06F17/3071
Abstract: A probabilistic document categorizer has an associated vocabulary of words and an associated plurality of probabilistic categorizer parameters derived from a collection of documents. A new document is received. The probabilistic categorizer parameters are updated to reflect addition of the new document to the collection of documents based on vocabulary words contained in the new document, a category of the new document, and a collection size parameter indicative of an effective total number of instances of vocabulary words in the collection of documents.
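
The update described in this abstract can be illustrated with a minimal Python sketch. It assumes a naive-Bayes-style multinomial categorizer, which the abstract does not specify; the class name `IncrementalCategorizer`, its attributes, and the exact update formulas are hypothetical choices for illustration, not the patented method. The model stores only probabilities plus an effective collection size, and folds a new labeled document in without revisiting the old documents.

```python
from collections import Counter


class IncrementalCategorizer:
    """Toy multinomial categorizer (an assumption, not the patent's model).

    Parameters are kept as probabilities plus one collection-size value,
    the effective total number of vocabulary word instances seen so far.
    """

    def __init__(self, vocabulary, categories, collection_size=0.0):
        self.vocabulary = set(vocabulary)
        self.collection_size = collection_size
        self.p_category = {c: 0.0 for c in categories}
        self.p_word = {c: {w: 0.0 for w in self.vocabulary} for c in categories}

    def update(self, tokens, category):
        """Fold one new labeled document into the stored parameters."""
        # Only vocabulary words contained in the new document are counted.
        counts = Counter(t for t in tokens if t in self.vocabulary)
        doc_len = sum(counts.values())
        if doc_len == 0:
            return
        old_n = self.collection_size
        new_n = old_n + doc_len
        # Word mass previously attributed to this category.
        old_cat_n = old_n * self.p_category[category]
        new_cat_n = old_cat_n + doc_len
        # Re-weight the word distribution of the document's category.
        for w in self.vocabulary:
            old_mass = old_cat_n * self.p_word[category][w]
            self.p_word[category][w] = (old_mass + counts.get(w, 0)) / new_cat_n
        # Re-weight the category distribution.
        for c in self.p_category:
            added = doc_len if c == category else 0
            self.p_category[c] = (old_n * self.p_category[c] + added) / new_n
        self.collection_size = new_n
```

Choosing a `collection_size` smaller than the true historical count would make new documents weigh more heavily, which is one plausible reading of "effective" in the abstract.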

3. Patent application

US20060190241A1 Apparatus and methods for aligning words in bilingual sentences (Status: lapsed)
Publication No.: US20060190241A1
Publication date: 2006-08-24
Application No.: US11137590
Filing date: 2005-05-26
Applicants: Cyril Goutte, Michel Simard, Kenji Yamada, Eric Gaussier, Arne Mauser
Inventors: Cyril Goutte, Michel Simard, Kenji Yamada, Eric Gaussier, Arne Mauser
IPC classification: G06F17/28
CPC classification: G06F17/2827
Abstract: Methods are disclosed for performing proper word alignment that satisfy constraints of coverage and transitive closure. Initially, a translation matrix which defines word association measures between source and target words of a corpus of bilingual translations of source and target sentences is computed. Subsequently, in a first method, the association measures in the translation matrix are factorized and orthogonalized to produce cepts for the source and target words, which resulting matrix factors may then be, optionally, multiplied to produce an alignment matrix. In a second method, the association measures in the translation matrix are thresholded, and then closed by transitivity, to produce an alignment matrix, which may then be, optionally, factorized to produce cepts. The resulting cepts or alignment matrices may then be used by any number of natural language applications for identifying words that are properly aligned.
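
The second method in this abstract (threshold, then close by transitivity) can be sketched in a few lines of Python. This is an illustrative reading only: the function name `align_by_threshold_and_closure`, the list-of-lists matrix layout, and the fixed-point loop are assumptions, and the coverage constraint mentioned in the abstract is not enforced here.

```python
def align_by_threshold_and_closure(assoc, threshold):
    """Return a boolean source-by-target alignment matrix.

    assoc[i][j] is an association measure between source word i and
    target word j. Entries at or above `threshold` are kept, then the
    result is closed by transitivity: if two source words share one
    target word, they end up sharing all of their target words.
    """
    n_src, n_tgt = len(assoc), len(assoc[0])
    align = [[assoc[i][j] >= threshold for j in range(n_tgt)] for i in range(n_src)]
    changed = True
    while changed:  # repeat until no new link is added
        changed = False
        for i in range(n_src):
            for k in range(n_src):
                if i == k:
                    continue
                shares_target = any(align[i][j] and align[k][j] for j in range(n_tgt))
                if not shares_target:
                    continue
                # Copy every link of source word k onto source word i.
                for j in range(n_tgt):
                    if align[k][j] and not align[i][j]:
                        align[i][j] = True
                        changed = True
    return align
```

After closure, each group of source words with identical rows, together with the target words in those rows, forms one connected block of the matrix, which is one way to picture a cept.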

4. Granted patent

US07849087B2 Incremental training for probabilistic categorizer (Status: in force)
Publication No.: US07849087B2
Publication date: 2010-12-07
Application No.: US11170019
Filing date: 2005-06-29
Applicants: Cyril Goutte, Eric Gaussier
Inventors: Cyril Goutte, Eric Gaussier
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/277, G06F17/3071
Abstract: A probabilistic document categorizer has an associated vocabulary of words and an associated plurality of probabilistic categorizer parameters derived from a collection of documents. A new document is received. The probabilistic categorizer parameters are updated to reflect addition of the new document to the collection of documents based on vocabulary words contained in the new document, a category of the new document, and a collection size parameter indicative of an effective total number of instances of vocabulary words in the collection of documents.

5. Granted patent

US07536295B2 Machine translation using non-contiguous fragments of text (Status: lapsed)
Publication No.: US07536295B2
Publication date: 2009-05-19
Application No.: US11315043
Filing date: 2005-12-22
Applicants: Nicola Cancedda, Bruno Cavestro, Marc Dymetman, Eric Gaussier, Cyril Goutte, Michel Simard, Kenji Yamada
Inventors: Nicola Cancedda, Bruno Cavestro, Marc Dymetman, Eric Gaussier, Cyril Goutte, Michel Simard, Kenji Yamada
IPC classification: G06F17/28
CPC classification: G06F17/2827
Abstract: A machine translation method for translating source text from a first language to target text in a second language includes receiving the source text in the first language and accessing a library of bi-fragments, each of the bi-fragments including a text fragment from the first language and a text fragment from the second language, at least some of the bi-fragments comprising non-contiguous bi-fragments in which at least one of the text fragment from the first language and the text fragment from the second language comprises a non-contiguous fragment.
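
One way to picture a library entry with a non-contiguous side is the Python sketch below. The `BiFragment` class, the `GAP` marker, and the rule that each gap spans one or more words are all hypothetical; the abstract only says that a fragment may be non-contiguous. The French "ne ... pas" pair in the usage note is an illustrative example, not taken from the listing.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

GAP = None  # hypothetical marker for a gap inside a fragment


@dataclass(frozen=True)
class BiFragment:
    source: Tuple[Optional[str], ...]  # first-language words, GAP where words are skipped
    target: Tuple[Optional[str], ...]  # second-language words, same convention

    @property
    def is_non_contiguous(self) -> bool:
        return GAP in self.source or GAP in self.target


def source_matches(fragment: BiFragment, sentence: Sequence[str], start: int = 0) -> bool:
    """Check whether the source side can be laid over the sentence from `start`.

    Assumption: every gap must cover at least one sentence word.
    """
    def match(f_idx: int, s_idx: int) -> bool:
        if f_idx == len(fragment.source):
            return True  # whole fragment consumed
        if s_idx >= len(sentence):
            return False  # sentence exhausted first
        item = fragment.source[f_idx]
        if item is GAP:
            # Try every possible gap length of one or more words.
            return any(match(f_idx + 1, nxt) for nxt in range(s_idx + 1, len(sentence) + 1))
        return sentence[s_idx] == item and match(f_idx + 1, s_idx + 1)

    return match(0, start)
```

For instance, `BiFragment(("ne", GAP, "pas"), ("not",))` would match the sentence `["je", "ne", "veux", "pas", "danser"]` starting at position 1, with the gap covering "veux".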

6. Granted patent

US07139754B2 Method for multi-class, multi-label categorization using probabilistic hierarchical modeling (Status: in force)
Publication No.: US07139754B2
Publication date: 2006-11-21
Application No.: US10774966
Filing date: 2004-02-09
Applicants: Cyril Goutte, Eric Gaussier
Inventors: Cyril Goutte, Eric Gaussier
IPC classification: G06F17/30
CPC classification: G06F17/30707, Y10S707/99933, Y10S707/99934, Y10S707/99935, Y10S707/99936
Abstract: A method of categorizing objects in which there can be multiple categories of objects and each object can belong to more than one category is described. The method defines a set of categories in which at least one category is dependent on another category and then organizes the categories in a hierarchy that embodies any dependencies among them. Each object is assigned to one or more categories in the set. A set of labels corresponding to all combinations of any number of the categories is defined, wherein if an object is relevant to several categories, the object must be assigned the label corresponding to the subset of all relevant categories. Once the new labels are defined, the multi-category, multi-label problem has been reduced to a multi-category, single-label problem, and the categorization task is reduced down to choosing the single best label set for an object.
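
The reduction from a multi-label problem to a single-label one can be sketched as follows. The function names `build_label_set` and `to_single_label` are hypothetical, categories are numbered from 1 as a convenience, and the hierarchy of dependent categories described in the abstract is not modeled here.

```python
from itertools import combinations


def build_label_set(num_categories):
    """Enumerate every non-empty combination of categories as one label."""
    categories = range(1, num_categories + 1)
    return [
        frozenset(combo)
        for size in range(1, num_categories + 1)
        for combo in combinations(categories, size)
    ]


def to_single_label(relevant_categories, labels):
    """Map an object's set of relevant categories to a single label index."""
    return labels.index(frozenset(relevant_categories))
```

With three categories this yields seven labels, so an object relevant to categories 1 and 3 receives exactly one label, the one standing for the subset {1, 3}. The label count grows as 2^L - 1, which suggests why organizing categories in a hierarchy could help keep the problem tractable.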

7. Granted patent

US07672830B2 Apparatus and methods for aligning words in bilingual sentences (Status: lapsed)
Publication No.: US07672830B2
Publication date: 2010-03-02
Application No.: US11137590
Filing date: 2005-05-26
Applicants: Cyril Goutte, Michel Simard, Kenji Yamada, Eric Gaussier, Arne Mauser
Inventors: Cyril Goutte, Michel Simard, Kenji Yamada, Eric Gaussier, Arne Mauser
IPC classification: G06F17/28, G06F17/27
CPC classification: G06F17/2827
Abstract: Methods are disclosed for performing proper word alignment that satisfy constraints of coverage and transitive closure. Initially, a translation matrix which defines word association measures between source and target words of a corpus of bilingual translations of source and target sentences is computed. Subsequently, in a first method, the association measures in the translation matrix are factorized and orthogonalized to produce cepts for the source and target words, which resulting matrix factors may then be, optionally, multiplied to produce an alignment matrix. In a second method, the association measures in the translation matrix are thresholded, and then closed by transitivity, to produce an alignment matrix, which may then be, optionally, factorized to produce cepts. The resulting cepts or alignment matrices may then be used by any number of natural language applications for identifying words that are properly aligned.

8. Granted patent

US07542893B2 Machine translation using elastic chunks (Status: lapsed)
Publication No.: US07542893B2
Publication date: 2009-06-02
Application No.: US11431393
Filing date: 2006-05-10
Applicants: Nicola Cancedda, Marc Dymetman, Eric Gaussier, Cyril Goutte
Inventors: Nicola Cancedda, Marc Dymetman, Eric Gaussier, Cyril Goutte
IPC classification: G06F17/28
CPC classification: G06F17/2818
Abstract: A machine translation method includes receiving source text in a first language and retrieving text fragments in a target language from a library of bi-fragments to generate a target hypothesis. Each bi-fragment includes a text fragment from the first language and a corresponding text fragment from the second language. Some of the bi-fragments are modeled as elastic bi-fragments where a gap between words is able to assume a variable size corresponding to a number of other words to occupy the gap. The target hypothesis is evaluated with a translation scoring function which scores the target hypothesis according to a plurality of feature functions, at least one of the feature functions comprising a gap size scoring feature which favors hypotheses with statistically more probable gap sizes over hypotheses with statistically less probable gap sizes.
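
A gap size scoring feature of the kind this abstract describes might look like the sketch below. It is a guess at one simple realization: the function names `fit_gap_model` and `gap_size_feature`, the additive smoothing, and the use of a summed log-probability are assumptions, and the abstract does not say how gap-size statistics are estimated.

```python
import math
from collections import Counter


def fit_gap_model(observed_gap_sizes, max_gap, smoothing=1.0):
    """Estimate P(gap size) for sizes 0..max_gap with additive smoothing."""
    counts = Counter(observed_gap_sizes)
    total = len(observed_gap_sizes) + smoothing * (max_gap + 1)
    return {g: (counts.get(g, 0) + smoothing) / total for g in range(max_gap + 1)}


def gap_size_feature(hypothesis_gaps, gap_model):
    """Sum of log-probabilities of the gap sizes used in one hypothesis.

    Higher (less negative) values favor statistically more probable gaps.
    Every gap size must be at most the max_gap used to fit the model.
    """
    return sum(math.log(gap_model[g]) for g in hypothesis_gaps)
```

In a log-linear scoring function this value would be one feature among several, multiplied by its own weight; a hypothesis whose gaps have commonly observed sizes scores higher than one with rarely observed sizes.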

9. Patent application

US20060136410A1 Method and apparatus for explaining categorization decisions (Status: in force)
Publication No.: US20060136410A1
Publication date: 2006-06-22
Application No.: US11013365
Filing date: 2004-12-17
Applicants: Eric Gaussier, Cyril Goutte
Inventors: Eric Gaussier, Cyril Goutte
IPC classification: G06F17/30
CPC classification: G06K9/623, Y10S707/99933
Abstract: Feature selection is used to determine feature influence for a given categorization decision to identify those features in a categorized document that were important in classifying the document into one or more classes. In one embodiment, model parameters of a categorization model are used to determine the features that contributed to the categorization decision of a document. In another embodiment, the model parameters of the categorization model and the features of the categorized document are used to determine the features that contributed to the categorization decision of a document.
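
The two embodiments in this abstract can be contrasted with a small Python sketch. It assumes a linear categorization model with one weight per feature for the class in question, which the abstract does not state; the function names `influence_from_model` and `influence_for_document` and the ranking by absolute value are hypothetical.

```python
def influence_from_model(weights, top_k=3):
    """First embodiment (sketch): rank features using model parameters only."""
    return sorted(weights, key=lambda f: abs(weights[f]), reverse=True)[:top_k]


def influence_for_document(weights, doc_features, top_k=3):
    """Second embodiment (sketch): combine model parameters with the
    feature values found in the categorized document."""
    contributions = {f: weights.get(f, 0.0) * v for f, v in doc_features.items()}
    return sorted(contributions, key=lambda f: abs(contributions[f]), reverse=True)[:top_k]
```

The difference shows up when a feature with a large weight is absent from the document: the first function still reports it, while the second reports only features that are present and weighs them by how often they occur.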

10. Patent application

US20050187892A1 Method for multi-class, multi-label categorization using probabilistic hierarchical modeling (Status: in force)
Publication No.: US20050187892A1
Publication date: 2005-08-25
Application No.: US10774966
Filing date: 2004-02-09
Applicants: Cyril Goutte, Eric Gaussier
Inventors: Cyril Goutte, Eric Gaussier
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/30707, Y10S707/99933, Y10S707/99934, Y10S707/99935, Y10S707/99936
Abstract: A method for categorizing a set of objects includes defining a set of categories in which at least one category in the set is dependent on another category in the set; organizing the set of categories in a hierarchy that embodies any dependencies among the categories in the set; for each object, assigning to the object one or more categories $l_1, \dots, l_P$, where $l_i \in \{1, \dots, L\}$, from a set $\{1, \dots, L\}$ of possible categories, wherein the assigned categories represent a subset of categories for which the object is relevant; defining a new set of labels $z$ comprising all possible combinations of any number of the categories, $z \in \{\{1\}, \{2\}, \dots, \{L\}, \{1,2\}, \dots, \{1,L\}, \{2,3\}, \dots, \{1,2,3\}, \dots, \{1,2,\dots,L\}\}$, such that if an object is relevant to several categories, the object must be assigned the label $z$ corresponding to the subset of all relevant categories; and assigning to the object the several categories and the subcategories of the several categories.
