    • 2. Invention patent (granted)
    • Consistency checker for documents containing Japanese text
    • US06175834B1
    • 2001-01-16
    • US09104257
    • 1998-06-24
    • Patrick Pei Cai; Patrick H. Halstead
    • Patrick Pei Cai; Patrick H. Halstead
    • G06F17/00
    • G06F17/273; G06F17/2863; Y10S707/917; Y10S707/99943
    • A Consistency Checker provides an improved method of analyzing a Japanese text document to identify inconsistently spelled words. The Consistency Checker utilizes a Reading Pair Database (RPD) and a Compressed Lexicon Database (CLD) to determine the reading units within a word, to calculate a Reading Pair Identification Number (RID) for each reading unit, to calculate a Sense Identification Number (SID) for each word, and to calculate a Spelling Variant Identification Number (SVID) for each word. Spelling variants are generated by combining variations of individual RIDs in the RID array. A Registry is updated to maintain statistics on all of the words within the document. An error field within the Registry indicates that the document contains more than one spelling variant of a particular word. The client program can access the Registry to alert a user to inconsistencies discovered in the document. The RPD comprises a list of reading pairs correlating Japanese text reading units of one character set with equivalent Japanese text reading units of another character set. Equivalent reading units from each character set are combined to form a reading pair and each reading pair is assigned a RID. A method is provided for generating the RPD by analyzing a list of Japanese words and a list of Japanese word equivalents having different spellings. Reading units are discovered by splitting the words at common dividing points and eliminating low-occurrence reading units until a set of high-occurrence reading units is defined.
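The registry idea in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the three-entry reading pair table, the greedy longest-match segmentation, the use of the RID tuple as a stand-in for the Sense Identification Number, and the use of the surface string as a stand-in for the Spelling Variant Identification Number are all simplifying assumptions, and every name below (`READING_PAIRS`, `rid_array`, `build_registry`, `inconsistencies`) is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical toy reading pair database (RPD): each RID maps to a
# (kanji unit, kana unit) pair that is read the same way.
READING_PAIRS = {
    1: ("子", "こ"),
    2: ("供", "ども"),
    3: ("猫", "ねこ"),
}

# Reverse lookup: any reading unit (kanji or kana) -> its RID.
UNIT_TO_RID = {unit: rid for rid, pair in READING_PAIRS.items() for unit in pair}


def rid_array(word):
    """Split a word into known reading units (greedy longest match)
    and return the list of their RIDs."""
    rids = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            unit = word[i:j]
            if unit in UNIT_TO_RID:
                rids.append(UNIT_TO_RID[unit])
                i = j
                break
        else:
            raise ValueError(f"no reading unit found at position {i} of {word!r}")
    return rids


def build_registry(words):
    """Count every spelling seen for each RID sequence.
    The RID tuple stands in for the SID, the surface string for the SVID."""
    registry = defaultdict(Counter)
    for word in words:
        registry[tuple(rid_array(word))][word] += 1
    return registry


def inconsistencies(registry):
    """Return the entries for which more than one spelling variant was seen,
    mirroring the role of the error field described in the abstract."""
    return {key: dict(variants) for key, variants in registry.items() if len(variants) > 1}


document_words = ["子供", "子ども", "猫", "猫", "こども"]
registry = build_registry(document_words)
flagged = inconsistencies(registry)
print(flagged)
```

In this toy document, the three spellings 子供, 子ども, and こども resolve to the same RID sequence and are flagged together, while 猫 appears twice with one spelling and is not flagged. A real system would need a far larger reading pair table and a disambiguation step for words that share readings but differ in sense.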