    • 1. Granted invention patent
    • Machine translation system using well formed substructures
    • US5848385A
    • 1998-12-08
    • US562686
    • 1995-11-27
    • Victor Poznanski; John Luis Beaven; Ian George Johnson
    • G06F17/27; G06F17/28
    • G06F17/271; G06F17/2755; G06F17/277; G06F17/2872
    • Source language text from an input interface is broken down into source language morphemes by a morphological analyzer. A syntactic analyzer converts the morphemes into source language signs labelled with identifiers and data identifying other signs which are grammatically related. A bilingual equivalence transformer transforms the source language signs to target language signs which are combined by a combiner to provide a first attempt at a target language structure. The structure is repeatedly evaluated by an evaluator and transformed by a transformer. The signs of well formed substructures identified by the evaluator are not dissociated from each other by the transformer. This process ends when either the whole target language structure is evaluated as being well formed or all transformations have been unsuccessfully evaluated.
    • 2. Granted invention patent
    • Language identification in multilingual text
    • US08635061B2
    • 2014-01-21
    • US12904642
    • 2010-10-14
    • Kang Li; Stephen Allen Kloder; Ian George Johnson; Siarhei Alonichau
    • G06F17/20; G06F17/27; G10L15/00
    • G06F17/275; G06F17/30864
    • Methods, systems, and media are provided for identifying languages in multilingual text. A document is decoded into a universal representative coding for easier tag manipulation, then broken into plain-text content sections. The sections are identified and assigned a weight, wherein more informative sections are given a higher weight and less informative sections are given a lesser weight. A language likelihood score is determined for each word, phrase, or character n-gram in a section. The language likelihood scores within a section are combined for each language. The combined section scores are then summed together to obtain a total document score for each language. This results in a document score for each language, which can be ranked to determine the primary language for the document.
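
The scoring flow described in the abstract of US08635061B2 (per-n-gram likelihood scores, combined per section, weighted, then summed per language and ranked) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `NGRAM_SCORES` table and its toy values, the use of character bigrams, summation as the way scores are combined within a section, and multiplication by the section weight.

```python
from collections import defaultdict

# Hypothetical per-language bigram likelihoods (toy values, not from the patent).
NGRAM_SCORES = {
    "en": {"th": 0.9, "he": 0.8, "in": 0.6},
    "de": {"ch": 0.9, "ei": 0.8, "in": 0.5},
}


def char_ngrams(text, n=2):
    """Return the overlapping character n-grams of a lowercased string."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def section_scores(text):
    """Combine n-gram scores within one section (assumed here: a plain sum)."""
    scores = defaultdict(float)
    for gram in char_ngrams(text):
        for lang, table in NGRAM_SCORES.items():
            scores[lang] += table.get(gram, 0.0)
    return scores


def rank_languages(sections):
    """Rank languages for a document given (text, weight) sections.

    Each section's per-language score is scaled by the section weight
    (assumed: multiplication) and summed into a document total.
    """
    totals = defaultdict(float)
    for text, weight in sections:
        for lang, score in section_scores(text).items():
            totals[lang] += weight * score
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# A more informative section gets a higher weight than a less informative one.
document = [("the thin heath", 2.0), ("ich", 0.5)]
ranking = rank_languages(document)
primary_language = ranking[0][0]
```

The first entry of `ranking` is the candidate primary language; a real system would use trained likelihood tables and would first decode the document and strip markup, as the abstract describes.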