    • 7. Granted invention patent
    • Methods and systems to train models to extract and integrate information from data sources
    • US08805861B2
    • 2014-08-12
    • US12467235
    • 2009-05-15
    • Justin Boyan; Glenn McDonald; Margaret Benthall; Ray Molnar
    • Justin Boyan; Glenn McDonald; Margaret Benthall; Ray Molnar
    • G06F7/00
    • G06F17/3089
    • Methods and systems to model and acquire data from a variety of data and information sources, to integrate the data into a structured database, and to manage the continuing reintegration of updated data from those sources over time. For any given domain, a variety of individual information and data sources that contain information relevant to the schema can be identified. Data elements associated with a schema may be identified in a training source, such as by user tagging. A formal grammar may be induced appropriate to the schema and layout of the training source. A Hidden Markov Model (HMM) corresponding to the grammar may learn where in the sources the elements can be found. The system can automatically mutate its schema into a grammar matching the structure of the source documents. By following an inverse transformation sequence, data that is parsed by the mutated grammar can be fit back into the original grammar structure, matching the original data schema defined through domain modeling. Features disclosed herein may be implemented with respect to web-scraping and data acquisition, and to represent data in support of data-editing and data-merging tasks. A schema may be defined with respect to a graph-based domain model.
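The abstract above states that a Hidden Markov Model corresponding to the grammar may learn where in the sources the schema elements can be found. The following is a minimal sketch of that idea, not the patented method: a hand-written HMM whose hidden states are hypothetical schema fields (`name`, `price`, `other`) and whose observations are coarse token classes (`markup`, `capword`, `number`). All probabilities are invented for illustration; a real system would estimate them from the tagged training source. Viterbi decoding then assigns each token to the most likely field.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations."""
    floor = 1e-9  # stand-in for zero probabilities so log() is defined

    def lg(p):
        return math.log(p if p > 0 else floor)

    # best[t][s]: log-probability of the best path ending in state s at step t
    best = [{s: lg(start_p[s]) + lg(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + lg(trans_p[p].get(s, 0.0))) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lg(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical field states and token classes (assumptions, not from the patent).
states = ["other", "name", "price"]
start_p = {"other": 0.8, "name": 0.1, "price": 0.1}
trans_p = {
    "other": {"other": 0.5, "name": 0.3, "price": 0.2},
    "name": {"name": 0.6, "other": 0.4},
    "price": {"other": 0.9, "price": 0.1},
}
emit_p = {
    "other": {"markup": 0.9, "capword": 0.05, "number": 0.05},
    "name": {"capword": 0.9, "markup": 0.05, "number": 0.05},
    "price": {"number": 0.9, "markup": 0.05, "capword": 0.05},
}

tokens = ["markup", "capword", "capword", "markup", "number", "markup"]
labels = viterbi(tokens, states, start_p, trans_p, emit_p)
print(list(zip(tokens, labels)))
```

Here the two capitalized words are labeled `name` and the number is labeled `price`, with the surrounding markup labeled `other`.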
    • 10. Invention patent application
    • METHODS AND SYSTEMS TO TRAIN MODELS TO EXTRACT AND INTEGRATE INFORMATION FROM DATA SOURCES
    • US20100145902A1
    • 2010-06-10
    • US12467235
    • 2009-05-15
    • Justin BOYAN; Glenn McDonald; Margaret Benthall; Ray Molnar
    • Justin BOYAN; Glenn McDonald; Margaret Benthall; Ray Molnar
    • G06N5/02; G06F3/048; G06F17/00; G06F17/30; G06F15/18
    • G06F17/3089
    • Methods and systems to model and acquire data from a variety of data and information sources, to integrate the data into a structured database, and to manage the continuing reintegration of updated data from those sources over time. For any given domain, a variety of individual information and data sources that contain information relevant to the schema can be identified. Data elements associated with a schema may be identified in a training source, such as by user tagging. A formal grammar may be induced appropriate to the schema and layout of the training source. A Hidden Markov Model (HMM) corresponding to the grammar may learn where in the sources the elements can be found. The system can automatically mutate its schema into a grammar matching the structure of the source documents. By following an inverse transformation sequence, data that is parsed by the mutated grammar can be fit back into the original grammar structure, matching the original data schema defined through domain modeling. Features disclosed herein may be implemented with respect to web-scraping and data acquisition, and to represent data in support of data-editing and data-merging tasks. A schema may be defined with respect to a graph-based domain model.
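The abstract describes mutating the schema into a grammar that matches the source documents, then following an inverse transformation sequence so that parsed data fits back into the original schema. The sketch below illustrates only that round trip, under strong simplifying assumptions: the schema is a flat list of field names rather than a formal grammar, and the two mutation types (`Rename`, `Merge`) and all field names are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass

@dataclass
class Rename:
    """The source labels a schema field differently."""
    old: str
    new: str

    def mutate(self, schema):
        return [self.new if f == self.old else f for f in schema]

    def invert(self, record):
        out = dict(record)
        out[self.old] = out.pop(self.new)
        return out

@dataclass
class Merge:
    """The source presents several schema fields as one delimited field."""
    parts: tuple
    merged: str
    sep: str

    def mutate(self, schema):
        kept = [f for f in schema if f not in self.parts]
        return kept + [self.merged]

    def invert(self, record):
        out = dict(record)
        values = out.pop(self.merged).split(self.sep)
        out.update(zip(self.parts, values))
        return out

def mutate_schema(schema, steps):
    """Apply each mutation in order to obtain the source-shaped schema."""
    for step in steps:
        schema = step.mutate(schema)
    return schema

def restore_record(record, steps):
    """Undo the mutations in reverse order to recover the original shape."""
    for step in reversed(steps):
        record = step.invert(record)
    return record

schema = ["name", "city", "state"]
steps = [Merge(("city", "state"), "location", ", "), Rename("name", "title")]

mutated = mutate_schema(schema, steps)
parsed = {"title": "Blue Note", "location": "New York, NY"}  # as parsed from a source
restored = restore_record(parsed, steps)
print(mutated, restored)
```

Recording each mutation as an object with its own inverse keeps the restore step mechanical: the sequence is simply replayed backwards, which is the property the abstract relies on.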