    • 41. Invention application
    • BOOTSTRAPPING TEXT CLASSIFIERS BY LANGUAGE ADAPTATION
    • WO2011100862A1
    • 2011-08-25
    • PCT/CN2010/000225
    • 2010-02-22
    • YAHOO! INC.; SHI, Lei; TIAN, Mingjun
    • SHI, Lei; TIAN, Mingjun
    • G06F17/28
    • G06F17/30705
    • Training data in one language is leveraged to develop classifiers for multiple languages under circumstances where all of those classifiers will be performing the same kind of classification task, but relative to linguistically different sets of texts, thereby saving the cost of manually labeling a different set of training data for each language. Classification knowledge is learned for a source language in which training data are available. That knowledge is transferred to another target language's classifier through the integration of language transition knowledge. The transferred model is adjusted to better fit the target language. In one technique, leveraging one language's classification knowledge in order to generate a classifier for another language involves training a text classifier in a source language, transferring the learned classification knowledge from the source language to another target language using language translation techniques, and further tuning the transferred model to better fit the target language text.
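The three steps named in the abstract above (train in a source language, transfer through translation, tune on target-language text) can be illustrated with a minimal sketch. This is not the patented method: it assumes a multinomial Naive Bayes word-count model without class priors, a hypothetical word-to-word `dictionary`, and simple self-training as the tuning step. All function names and the toy English/Spanish data are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def train_source(labeled_docs):
    """Step 1: count words per class in the labeled source-language data."""
    model = defaultdict(Counter)
    for tokens, label in labeled_docs:
        model[label].update(tokens)
    return model


def transfer(model, dictionary):
    """Step 2: carry each word's class counts over to its translations."""
    target = defaultdict(Counter)
    for label, counts in model.items():
        for word, n in counts.items():
            for translation in dictionary.get(word, []):
                target[label][translation] += n
    return target


def classify(model, tokens):
    """Pick the class with the highest add-one-smoothed log-likelihood."""
    vocab_size = len({w for counts in model.values() for w in counts})

    def score(label):
        counts = model[label]
        total = sum(counts.values())
        return sum(math.log((counts[w] + 1) / (total + vocab_size)) for w in tokens)

    return max(model, key=score)


def adapt(base, unlabeled_docs, rounds=2):
    """Step 3: self-training on unlabeled target text (an assumed tuning method)."""
    model = base
    for _ in range(rounds):
        updated = defaultdict(Counter)
        for label, counts in base.items():
            updated[label].update(counts)
        # Pseudo-label each target document with the current model.
        for tokens in unlabeled_docs:
            updated[classify(model, tokens)].update(tokens)
        model = updated
    return model


# Toy data: English is the source language, Spanish the target.
source_docs = [
    (["goal", "match", "team"], "sports"),
    (["stock", "market", "bank"], "finance"),
]
dictionary = {
    "goal": ["gol"], "match": ["partido"], "team": ["equipo"],
    "stock": ["accion"], "market": ["mercado"], "bank": ["banco"],
}
unlabeled_target = [["gol", "partido", "liga"], ["mercado", "banco", "bolsa"]]

transferred = transfer(train_source(source_docs), dictionary)
adapted = adapt(transferred, unlabeled_target)
print(classify(adapted, ["liga"]), classify(adapted, ["bolsa"]))
```

In this toy setup the words "liga" and "bolsa" have no dictionary entry, so the transferred model alone cannot tell them apart. The adaptation step picks them up from unlabeled target documents that also contain translated words.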
    • 43. Invention application
    • ACQUISITION OF OUT-OF-VOCABULARY TRANSLATIONS BY DYNAMICALLY LEARNING EXTRACTION RULES
    • WO2011035455A1
    • 2011-03-31
    • PCT/CN2009/001078
    • 2009-09-25
    • YAHOO! INC.; SHI, Lei
    • SHI, Lei
    • H04W4/00
    • G06F17/2827; G06F17/2854
    • A method and apparatus are provided for identifying a set of bilingual term pairs in a bilingual webpage and, from that set, identifying a set of candidate patterns related to the layout of the bilingual term pairs in the webpage. From the set of candidate patterns, one or more best patterns can be selected based on features identified in the candidate patterns. Using the one or more selected patterns, a set of translation pair candidates can be extracted. The translation pair candidates can be verified to determine the likelihood that each translation pair candidate is an accurate translation. Based on the verification, some or all of the translation pair candidates can be discarded as incorrect translations, and the remaining translation pair candidates can be identified as correct translation pairs.
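The pipeline in the abstract above (seed pairs, candidate layout patterns, pattern selection, extraction, verification) can be sketched as follows. This is an illustrative simplification, not the patented method: it assumes Chinese terms followed by English translations, represents a layout pattern as the characters between and after the two terms, selects the best pattern by frequency alone, and verifies candidates with a hypothetical length-ratio check. All names and thresholds are invented for the example.

```python
import re
from collections import Counter


def candidate_patterns(text, seed_pairs, max_gap=3):
    """Record the (infix, suffix) layout around each known pair found in the text."""
    patterns = Counter()
    for source, target in seed_pairs:
        regex = re.escape(source) + r"(.{1,%d}?)" % max_gap + re.escape(target) + r"(.?)"
        for match in re.finditer(regex, text):
            patterns[(match.group(1), match.group(2))] += 1
    return patterns


def extract_candidates(text, pattern):
    """Apply one layout pattern to pull out new (Chinese, English) candidates."""
    infix, suffix = pattern
    regex = (
        r"([\u4e00-\u9fff]+)" + re.escape(infix)
        + r"([A-Za-z][A-Za-z -]*[A-Za-z])" + re.escape(suffix)
    )
    return re.findall(regex, text)


def verify(candidates, max_ratio=1.5):
    """Keep pairs whose English word count is plausible for the Chinese length."""
    kept = []
    for chinese, english in candidates:
        ratio = len(english.split()) / len(chinese)
        if ratio <= max_ratio:  # assumed threshold, not taken from the source
            kept.append((chinese, english))
    return kept


# Toy page text standing in for a bilingual webpage.
page = (
    "机器学习(machine learning)是一个领域。"
    "搜索引擎(search engine)、自然语言处理(natural language processing)。"
    "详情(click here for more details about our products)"
)
seeds = [("机器学习", "machine learning"), ("搜索引擎", "search engine")]

patterns = candidate_patterns(page, seeds)
best_pattern = patterns.most_common(1)[0][0]  # frequency as the only feature
verified = verify(extract_candidates(page, best_pattern))
for pair in verified:
    print(pair)
```

Here the two seed pairs both appear in the "term(translation)" layout, so that pattern is selected and then used to find a pair that was not in the seed set. The length-ratio check discards the navigation-style text in the last parenthesis.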