    • 10. Invention application
    • SYSTEMS AND METHODS FOR LANGUAGE DETECTION
    • WO2018067440A1
    • 2018-04-12
    • PCT/US2017/054722
    • 2017-10-02
    • MACHINE ZONE, INC.
    • BOJJA, Nikhil; WANG, Pidong; GUO, Shiman
    • G06F17/27
    • Implementations of the present disclosure are directed to a method, a system, and a computer program storage device for identifying a language in a message. Non-language characters are removed from a text message to generate a sanitized text message. An alphabet and/or a script are detected in the sanitized text message by performing at least one of (i) an alphabet-based language detection test to determine a first set of scores and (ii) a script-based language detection test to determine a second set of scores. Each score in the first set of scores represents a likelihood that the sanitized text message includes the alphabet for one of a plurality of different languages. Each score in the second set of scores represents a likelihood that the sanitized text message includes the script for one of the plurality of different languages. The language in the sanitized text message is identified based on at least one of the first set of scores, the second set of scores, and a combination of the first and second sets of scores.
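The pipeline described in the abstract (sanitize the message, compute alphabet-based and script-based scores, then pick a language) can be illustrated with a minimal Python sketch. This is not the implementation claimed in the application: the tiny `ALPHABETS` and `SCRIPTS` tables, the function names, the use of letter fractions as scores, and the simple average used to combine the two score sets are all assumptions made for illustration.

```python
import unicodedata

# Hypothetical, deliberately tiny language tables (illustration only).
ALPHABETS = {
    "en": set("abcdefghijklmnopqrstuvwxyz"),
    "es": set("abcdefghijklmnopqrstuvwxyzñáéíóúü"),
    "ru": set("абвгдеёжзийклмнопрстуфхцчшщъыьэюя"),
    "el": set("αβγδεζηθικλμνξοπρστυφχψως"),
}
# Script of each language, matched against Unicode character names.
SCRIPTS = {"en": "LATIN", "es": "LATIN", "ru": "CYRILLIC", "el": "GREEK"}


def sanitize(message):
    """Drop non-language characters (digits, punctuation, emoji)."""
    text = unicodedata.normalize("NFC", message)
    kept = "".join(ch for ch in text if ch.isalpha() or ch.isspace())
    return " ".join(kept.split())  # collapse runs of whitespace


def _letters(text):
    return [ch for ch in text.lower() if ch.isalpha()]


def alphabet_scores(text):
    """First score set: fraction of letters found in each language's alphabet."""
    letters = _letters(text)
    if not letters:
        return {lang: 0.0 for lang in ALPHABETS}
    return {
        lang: sum(ch in alphabet for ch in letters) / len(letters)
        for lang, alphabet in ALPHABETS.items()
    }


def script_scores(text):
    """Second score set: fraction of letters written in each language's script."""
    letters = _letters(text)
    if not letters:
        return {lang: 0.0 for lang in SCRIPTS}
    names = [unicodedata.name(ch, "") for ch in letters]
    return {
        lang: sum(name.startswith(script) for name in names) / len(names)
        for lang, script in SCRIPTS.items()
    }


def identify_language(message):
    """Combine both score sets (assumed: plain average) and return the best language."""
    text = sanitize(message)
    first = alphabet_scores(text)
    second = script_scores(text)
    combined = {lang: (first[lang] + second[lang]) / 2 for lang in ALPHABETS}
    best = max(combined, key=combined.get)
    return best if combined[best] > 0 else None
```

With these toy tables, `identify_language("Привет, мир!!! 123")` returns `"ru"`. The script score alone cannot separate languages that share a script (English and Spanish are both Latin here), which is one reason a combination with the alphabet score is useful: the accented letters in `"¿Dónde está el baño?"` lower the English alphabet score and tip the result to `"es"`.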