    • 4. Granted invention patent
    • System and method for use in text analysis of documents and records
    • US06665661B1
    • 2003-12-16
    • US09672599
    • 2000-09-29
    • Vernon L. Crow; Randall E. Scarberry; Augustin J. Calaprist; Nancy E. Miller; Grant C. Nakamura; Jeffrey D. Saffer
    • G06F17/30
    • G06F17/30616; Y10S707/99932; Y10S707/99933
    • Methods and systems are provided that enable text in various sections of data records to be separately catalogued, indexed, or vectorized for analysis in a text visualization and mining system. A text processing system receives a plurality of data records, where each data record has one or a plurality of attribute fields associated with the records. The attribute fields containing textual information are identified. The specific textual content of each attribute field is identified. An index is generated that associates the textual content contained in each attribute field with the attribute field containing the textual content. The index is operable for use in text processing. The plurality of data records may be located in a data table and the textual information may be contained within cells of the data table. In another aspect, a plurality of data records is received, where at least some of the data records contain text terms. A first method is applied to weight text terms of the data records in a first manner to aid in distinguishing records from each other in response to selection of the first method. A second method is applied to weight text terms of the data records in a second manner to aid in distinguishing records from each other in response to selection of the second method. A vector is generated to distinguish each of the data records based on the text terms weighted by either the first or second method.
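
The two aspects described in the abstract (a per-field text index, and record vectors built with a selectable term-weighting method) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the abstract does not name the weighting methods, so raw term frequency and TF-IDF stand in for the "first" and "second" methods, and the function names, the tokenizer, and the dictionary-based record layout are hypothetical rather than taken from the patent.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def build_field_index(records):
    """Map each text-bearing attribute field to {term: set of record positions}."""
    index = defaultdict(lambda: defaultdict(set))
    for rec_id, record in enumerate(records):
        for field, value in record.items():
            # Only fields whose value is a string with at least one token count as textual.
            if isinstance(value, str):
                for term in tokenize(value):
                    index[field][term].add(rec_id)
    return index


def weight_tf(counts, doc_freq, n_records):
    """Stand-in for the 'first method': raw term frequency."""
    return {term: float(count) for term, count in counts.items()}


def weight_tfidf(counts, doc_freq, n_records):
    """Stand-in for the 'second method': term frequency times log inverse document frequency."""
    return {term: count * math.log(n_records / doc_freq[term])
            for term, count in counts.items()}


# The caller selects a weighting method by name (names are hypothetical).
WEIGHTING_METHODS = {"tf": weight_tf, "tfidf": weight_tfidf}


def vectorize(records, field, method):
    """Return (vocabulary, one weighted vector per record) for a single field."""
    weigh = WEIGHTING_METHODS[method]
    counts_per_record = [Counter(tokenize(str(r.get(field, "")))) for r in records]
    # Document frequency: number of records in which each term appears.
    doc_freq = Counter(term for counts in counts_per_record for term in counts)
    vocab = sorted(doc_freq)
    vectors = []
    for counts in counts_per_record:
        weights = weigh(counts, doc_freq, len(records))
        vectors.append([weights.get(term, 0.0) for term in vocab])
    return vocab, vectors


records = [
    {"id": 1, "title": "Text mining system", "notes": "index text"},
    {"id": 2, "title": "Text visualization", "notes": ""},
]
index = build_field_index(records)
vocab, tf_vectors = vectorize(records, "title", "tf")
_, tfidf_vectors = vectorize(records, "title", "tfidf")
```

In this toy data the term "text" occurs in every title, so the TF-IDF stand-in gives it zero weight while raw term frequency does not, which shows how the choice of method changes which terms distinguish the records.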