    • 1. Granted invention patent
    • Discriminative syntactic word order model for machine translation
    • US08452585B2
    • 2013-05-28
    • US12061313
    • 2008-04-02
    • Kristina Nikolova Toutanova; Pi-Chuan Chang
    • G06F17/27
    • G06F17/2818
    • A discriminatively trained word order model is used to identify a most likely word order from a set of word orders for target words translated from a source sentence. For each set of word orders, the discriminatively trained word order model uses features based on information in a source dependency tree and a target dependency tree and features based on the order of words in the word order. The discriminatively trained statistical model is trained by determining a translation metric for each of a set of N-best word orders for a set of target words. Each of the N-best word orders is projective with respect to a target dependency tree and the N-best word orders are selected using a combination of an n-gram language model and a local tree order model.
    • 2. Invention patent application
    • Discriminative Syntactic Word Order Model for Machine Translation
    • US20080319736A1
    • 2008-12-25
    • US12061313
    • 2008-04-02
    • Kristina Nikolova Toutanova; Pi-Chuan Chang
    • G06F17/27
    • G06F17/2818
    • A discriminatively trained word order model is used to identify a most likely word order from a set of word orders for target words translated from a source sentence. For each set of word orders, the discriminatively trained word order model uses features based on information in a source dependency tree and a target dependency tree and features based on the order of words in the word order. The discriminatively trained statistical model is trained by determining a translation metric for each of a set of N-best word orders for a set of target words. Each of the N-best word orders is projective with respect to a target dependency tree and the N-best word orders are selected using a combination of an n-gram language model and a local tree order model.
    • 3. Granted invention patent
    • Semi-supervised part-of-speech tagging
    • US08275607B2
    • 2012-09-25
    • US11954212
    • 2007-12-12
    • Kristina Nikolova Toutanova; Mark Edward Johnson
    • G06F17/27
    • G06F17/2755
    • A word is selected from a received text and features are identified from the word. The features are applied to a model to identify probabilities for sets of part-of-speech tags. The probabilities for the sets of part-of-speech tags are used to weight scores for possible part-of-speech tags for the selected word to form weighted scores. The weighted scores are used to select a part-of-speech tag for the word and the selected part of speech tag is stored or output. The scores for the possible part-of-speech tags are based on variational approximation parameters trained from a sparse prior over probability distributions describing the probability of a part-of-speech tag given a word.
    • 4. Invention patent application
    • UNSUPERVISED CHINESE WORD SEGMENTATION FOR STATISTICAL MACHINE TRANSLATION
    • US20090326916A1
    • 2009-12-31
    • US12163119
    • 2008-06-27
    • Jianfeng Gao; Kristina Nikolova Toutanova; Jia Xu
    • G06F17/28
    • G06F17/2827; G06F17/277; G06F17/2818; G06F17/2863
    • Described is using a generative model in processing an unsegmented sentence into a segmented sentence. A segmenter includes the generative model, which given an unsegmented sentence (e.g., in Chinese) provides candidate segmented sentences to a probability-based decoder that selects the segmented sentence. For example, the segmented (e.g., Chinese-language) sentence may be provided to a statistical machine translator that outputs a translated (e.g., English-language) sentence. The generative model may include a word sub-model that generates hidden words using a word model, a spelling sub-model that generates characters from the hidden words, and an alignment sub-model that generates translated words and alignment data from the characters. The word sub-model may correspond to a unigram model having words and associated frequency data therein, and the alignment sub-model may correspond to a word aligned corpus having source sentence, translated target sentence pairings therein. Training is also described.
    • 5. Invention patent application
    • SEMI-SUPERVISED PART-OF-SPEECH TAGGING
    • US20090157384A1
    • 2009-06-18
    • US11954212
    • 2007-12-12
    • Kristina Nikolova Toutanova; Mark Edward Johnson
    • G06F17/28
    • G06F17/2755
    • A word is selected from a received text and features are identified from the word. The features are applied to a model to identify probabilities for sets of part-of-speech tags. The probabilities for the sets of part-of-speech tags are used to weight scores for possible part-of-speech tags for the selected word to form weighted scores. The weighted scores are used to select a part-of-speech tag for the word and the selected part of speech tag is stored or output. The scores for the possible part-of-speech tags are based on variational approximation parameters trained from a sparse prior over probability distributions describing the probability of a part-of-speech tag given a word.
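
The selection step in the part-of-speech tagging abstract (application US11954212, records 3 and 5) can be illustrated with a small Python sketch. It is not the patented method. It assumes that the model output is a probability for each candidate set of tags, that each tag has a single precomputed score, and that a tag's weighted score is the sum of its score multiplied by the probability of every tag set that contains it. The function name `select_tag` and all numbers are hypothetical.

```python
from typing import Dict, FrozenSet


def select_tag(set_probs: Dict[FrozenSet[str], float],
               tag_scores: Dict[str, float]) -> str:
    """Return the tag with the highest weighted score.

    Assumption (not from the patent text): a tag's weighted score is the
    sum, over every tag set containing it, of P(tag set) * score(tag).
    """
    weighted: Dict[str, float] = {}
    for tag_set, prob in set_probs.items():
        for tag in tag_set:
            # Weight the per-tag score by the probability of this tag set.
            weighted[tag] = weighted.get(tag, 0.0) + prob * tag_scores.get(tag, 0.0)
    return max(weighted, key=weighted.get)


# Hypothetical inputs: the word is either noun/verb ambiguous or noun-only.
set_probs = {frozenset({"NN", "VB"}): 0.5, frozenset({"NN"}): 0.5}
tag_scores = {"NN": 0.4, "VB": 0.6}
print(select_tag(set_probs, tag_scores))
```

With these made-up numbers the unweighted scores favour `VB`, but the tag-set weighting shifts the choice to `NN`, because `NN` appears in both candidate sets.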