    • 12. Patent application
    • Method for multi-class, multi-label categorization using probabilistic hierarchical modeling
    • US20050187892A1
    • 2005-08-25
    • US10774966
    • 2004-02-09
    • Cyril Goutte; Eric Gaussier
    • Cyril Goutte; Eric Gaussier
    • G06F7/00; G06F17/30
    • G06F17/30707; Y10S707/99933; Y10S707/99934; Y10S707/99935; Y10S707/99936
    • A method for categorizing a set of objects includes defining a set of categories in which at least one category in the set is dependent on another category in the set; organizing the set of categories in a hierarchy that embodies any dependencies among the categories in the set; for each object, assigning to the object one or more categories $l_1, \ldots, l_P$, where $l_i \in \{1, \ldots, L\}$, from a set $\{1, \ldots, L\}$ of possible categories, wherein the assigned categories represent a subset of categories for which the object is relevant; defining a new set of labels $z$ comprising all possible combinations of any number of the categories, $z \in \{\{1\}, \{2\}, \ldots, \{L\}, \{1,2\}, \ldots, \{1,L\}, \{2,3\}, \ldots, \{1,2,3\}, \ldots, \{1,2,\ldots,L\}\}$, such that if an object is relevant to several categories, the object must be assigned the label $z$ corresponding to the subset of all relevant categories; and assigning to the object the several categories and the subcategories of the several categories.
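The label construction in the abstract above can be illustrated with a minimal sketch. It assumes categories are numbered 1..L and ignores the category hierarchy entirely; the function names `label_sets` and `to_single_label` are hypothetical and do not come from the patent.

```python
from itertools import combinations

def label_sets(num_categories):
    """Enumerate every non-empty subset of {1..L} as a candidate label z."""
    cats = range(1, num_categories + 1)
    return [frozenset(combo)
            for size in range(1, num_categories + 1)
            for combo in combinations(cats, size)]

def to_single_label(relevant, labels):
    """Map an object's set of relevant categories to the index of its label z."""
    return labels.index(frozenset(relevant))

labels = label_sets(3)
print(len(labels))                      # number of candidate labels for L = 3
print(to_single_label({1, 3}, labels))  # index of the label {1, 3}
```

The number of candidate labels grows as 2^L - 1, so a direct enumeration like this one is only practical for small L.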
    • 13. Patent application
    • Machine translation using elastic chunks
    • US20070265825A1
    • 2007-11-15
    • US11431393
    • 2006-05-10
    • Nicola Cancedda; Marc Dymetman; Eric Gaussier; Cyril Goutte
    • Nicola Cancedda; Marc Dymetman; Eric Gaussier; Cyril Goutte
    • G06F17/28
    • G06F17/2818
    • A machine translation method includes receiving source text in a first language and retrieving text fragments in a target language from a library of bi-fragments to generate a target hypothesis. Each bi-fragment includes a text fragment from the first language and a corresponding text fragment from the second language. Some of the bi-fragments are modeled as elastic bi-fragments where a gap between words is able to assume a variable size corresponding to a number of other words to occupy the gap. The target hypothesis is evaluated with a translation scoring function which scores the target hypothesis according to a plurality of feature functions, at least one of the feature functions comprising a gap size scoring feature which favors hypotheses with statistically more probable gap sizes over hypotheses with statistically less probable gap sizes.
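A gap size scoring feature of the kind described above might look like the following sketch. It assumes a weighted-sum scoring function and a simple table of gap size probabilities; neither detail is given in the abstract, and the names `gap_size_feature`, `score`, `gap_probs`, and the `lm` feature are hypothetical.

```python
import math

def gap_size_feature(gap_sizes, gap_probs, floor=1e-6):
    """Sum of log-probabilities of the gap sizes used in one hypothesis.

    Unseen gap sizes get a small floor probability (an assumption).
    """
    return sum(math.log(gap_probs.get(size, floor)) for size in gap_sizes)

def score(features, weights):
    """Weighted sum of feature values (assumed form of the scoring function)."""
    return sum(weights[name] * value for name, value in features.items())

# Toy gap size distribution: small gaps are more probable than large ones.
gap_probs = {0: 0.6, 1: 0.3, 2: 0.1}
weights = {"gap_size": 1.0, "lm": 0.5}

likely = score({"gap_size": gap_size_feature([0, 1], gap_probs), "lm": -2.0}, weights)
unlikely = score({"gap_size": gap_size_feature([2, 2], gap_probs), "lm": -2.0}, weights)
print(likely > unlikely)  # the hypothesis with more probable gap sizes scores higher
```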
    • 17. Granted patent
    • Method for multi-class, multi-label categorization using probabilistic hierarchical modeling
    • US07139754B2
    • 2006-11-21
    • US10774966
    • 2004-02-09
    • Cyril Goutte; Eric Gaussier
    • Cyril Goutte; Eric Gaussier
    • G06F17/30
    • G06F17/30707; Y10S707/99933; Y10S707/99934; Y10S707/99935; Y10S707/99936
    • A method of categorizing objects in which there can be multiple categories of objects and each object can belong to more than one category is described. The method defines a set of categories in which at least one category is dependent on another category and then organizes the categories in a hierarchy that embodies any dependencies among them. Each object is assigned to one or more categories in the set. A set of labels corresponding to all combinations of any number of the categories is defined, wherein if an object is relevant to several categories, the object must be assigned the label corresponding to the subset of all relevant categories. Once the new labels are defined, the multi-category, multi-label problem has been reduced to a multi-category, single-label problem, and the categorization task is reduced down to choosing the single best label set for an object.
    • 19. Patent application
    • Adaptive spam message detector
    • US20060123083A1
    • 2006-06-08
    • US11002179
    • 2004-12-03
    • Cyril Goutte; Pierre Isabelle; Eric Gaussier; Stephen Kruger
    • Cyril Goutte; Pierre Isabelle; Eric Gaussier; Stephen Kruger
    • G06F15/16
    • H04L51/12; G06Q10/107
    • Electronic content is filtered to identify spam using image and linguistic processing. A plurality of information type gatherers assimilate and output different message attributes relating to message content associated with an information type. A categorizer may have a plurality of decision makers for providing as output a message class for classifying the message data. A history processor records the message attributes and the class decision as part of the prior history information and/or modifies the prior history information to reflect changes to fixed data and/or probability data. A categorizer coalescer assesses the message class output by the set of decision makers together with optional user input for producing a class decision identifying whether the message data is spam.
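The role of the categorizer coalescer can be sketched as follows. The abstract does not say how the decision makers' outputs are combined, so this example assumes a plain majority vote with an explicit user label taking precedence; the function name `coalesce` and the class names "spam" and "ham" are hypothetical.

```python
from collections import Counter

def coalesce(decisions, user_label=None):
    """Combine the class outputs of several decision makers into one decision.

    Assumption: an explicit user label, when present, overrides the vote.
    """
    if user_label is not None:
        return user_label
    counts = Counter(decisions)
    # Ties fall back to "ham" so that legitimate mail is not flagged.
    if counts["spam"] > counts["ham"]:
        return "spam"
    return "ham"

print(coalesce(["spam", "spam", "ham"]))
print(coalesce(["spam", "spam"], user_label="ham"))
```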
    • 20. Granted patent
    • Text categorization based on co-classification learning from multilingual corpora
    • US08438009B2
    • 2013-05-07
    • US12909389
    • 2010-10-21
    • Massih Amini; Cyril Goutte
    • Massih Amini; Cyril Goutte
    • G06F17/20; G06F17/27
    • G06F17/30707; G06F17/2809
    • The present document describes a method and a system for generating classifiers from multilingual corpora including subsets of content-equivalent documents written in different languages. When the documents are translations of each other, their classifications must be substantially the same. Embodiments of the invention utilize this similarity in order to enhance the accuracy of the classification in one language based on the classification results in the other language, and vice versa. A system in accordance with the present embodiments implements a method which comprises generating a first classifier from a first subset of the corpora in a first language; generating a second classifier from a second subset of the corpora in a second language; and re-training each of the classifiers on its respective subset based on the classification results of the other classifier, until a training cost between the classification results produced by subsequent iterations reaches a local minimum.
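The alternating re-training loop described above can be sketched in a heavily simplified form. This is not the patented method: it assumes one numeric feature per document in each language, a midpoint-threshold classifier, and a cost equal to the number of disagreements between the two classifiers. All names (`fit_threshold`, `predict`, `co_train`) are hypothetical.

```python
def fit_threshold(xs, ys):
    """Fit a 1-D classifier: the midpoint between the two class means.

    Assumes both classes (0 and 1) occur in ys.
    """
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(threshold, xs):
    """Label each value 1 if it lies above the threshold, else 0."""
    return [1 if x > threshold else 0 for x in xs]

def co_train(view_a, view_b, labels, max_iter=20):
    """Alternately refit two classifiers on two views of the same documents.

    labels[i] is 0, 1, or None (unlabeled). Returns the pair of thresholds
    with the lowest disagreement cost seen before the cost stopped improving.
    """
    known = [i for i, y in enumerate(labels) if y is not None]
    t_a = fit_threshold([view_a[i] for i in known], [labels[i] for i in known])
    t_b = fit_threshold([view_b[i] for i in known], [labels[i] for i in known])
    best_cost, best = None, (t_a, t_b)
    for _ in range(max_iter):
        pred_a, pred_b = predict(t_a, view_a), predict(t_b, view_b)
        cost = sum(a != b for a, b in zip(pred_a, pred_b))
        if best_cost is not None and cost >= best_cost:
            break  # cost no longer decreases: treat as a local minimum
        best_cost, best = cost, (t_a, t_b)
        # Each classifier is refit using the other's output where no label exists.
        targets_a = [y if y is not None else p for y, p in zip(labels, pred_b)]
        targets_b = [y if y is not None else p for y, p in zip(labels, pred_a)]
        t_a = fit_threshold(view_a, targets_a)
        t_b = fit_threshold(view_b, targets_b)
    return best

view_a = [0.0, 1.0, 0.2, 0.9]   # toy feature per document, language A
view_b = [0.1, 0.8, 0.0, 1.0]   # same documents, language B
t_a, t_b = co_train(view_a, view_b, [0, 1, None, None])
print(predict(t_a, view_a), predict(t_b, view_b))
```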