    • 1. Invention application
    • ORDERING DOCUMENT CONTENT
    • WO2012099801A4
    • 2012-09-20
    • PCT/US2012021385
    • 2012-01-13
    • APPLE INC; MANSFIELD PHILIP ANDREW; LEVY MICHAEL ROBERT; CLEGG DEREK B
    • MANSFIELD PHILIP ANDREW; LEVY MICHAEL ROBERT; CLEGG DEREK B
    • G06F17/00
    • G06F17/2229; G06F17/212; G06F17/2241
    • For a page that has been decomposed into a set of primitive areas, a novel method for organizing the set of primitive areas into an ordered list is disclosed. The primitive areas in the ordered list are initially sorted using start point order relation ordering, which compares the start points of the primitive areas in the coordinate system of the page. The ordering of the primitive areas in the ordered list is then refined using contextual order relation ordering, which compares primitive areas against each other according to coordinate systems local to the primitive areas being compared. A new ordered list is then created by transposing primitive areas that are incorrectly ordered according to contextual order relation ordering.
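The two-stage ordering described in this abstract can be illustrated with a minimal Python sketch. It is not the patented method. The `Area` class, the function names, and in particular the rule inside `must_swap` are hypothetical stand-ins: the "local coordinate system" is modeled only as a translation to the first area's start point, and the contextual relation is a simple column heuristic (an area wholly to the left goes first; within the same column, the upper area goes first).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """Hypothetical primitive area; (x, y) is its start point on the page.

    Page coordinates are assumed to have y growing downward.
    """
    name: str
    x: float
    y: float
    width: float
    height: float


def start_point_key(area):
    # Stage 1: compare start points in the page coordinate system
    # (top to bottom, then left to right).
    return (area.y, area.x)


def must_swap(first, second):
    """Assumed contextual rule, evaluated in `first`'s local frame.

    Returns True when `second` should come before `first`.
    """
    # Start point of `second` relative to `first` (translation only).
    lx = second.x - first.x
    ly = second.y - first.y
    if lx + second.width <= 0:
        return True   # `second` lies wholly to the left: it reads first
    if lx >= first.width:
        return False  # `second` lies wholly to the right: order is fine
    return ly < 0     # same column: the upper area reads first


def order_areas(areas):
    # Stage 1: initial sort by start point order.
    ordered = sorted(areas, key=start_point_key)
    # Stage 2: transpose adjacent pairs the contextual rule rejects.
    # The rule need not be a total order, so the passes are capped.
    for _ in range(len(ordered)):
        swapped = False
        for i in range(len(ordered) - 1):
            if must_swap(ordered[i], ordered[i + 1]):
                ordered[i], ordered[i + 1] = ordered[i + 1], ordered[i]
                swapped = True
        if not swapped:
            break
    return ordered


# Two-column page: the right column starts above the second left paragraph.
page = [
    Area("R1", x=100, y=10, width=80, height=90),
    Area("L2", x=0, y=50, width=80, height=40),
    Area("L1", x=0, y=0, width=80, height=40),
]
print([a.name for a in order_areas(page)])
```

With this sample page, sorting by start point alone gives L1, R1, L2. The refinement pass transposes R1 and L2, so the left column is read in full before the right one.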
    • 9. Invention application
    • METHODS AND SYSTEM FOR DOCUMENT RECONSTRUCTION
    • WO2010078475A3
    • 2011-04-14
    • PCT/US2009069885
    • 2009-12-31
    • APPLE INC; MANSFIELD PHILIP ANDREW; LEVY MICHAEL ROBERT; CLEGG DEREK B
    • MANSFIELD PHILIP ANDREW; LEVY MICHAEL ROBERT; CLEGG DEREK B
    • G06F17/27
    • G06F17/2294; G06F17/21; G06F17/211; G06F17/212; G06F17/218; G06F17/2217; G06F17/2247; G06F17/243; G06F17/248; G06F17/2705; G06F17/28; G06F17/30011; G06K9/00456; G06K9/00463
    • Different embodiments of the invention use different techniques for analyzing an unstructured document to define a structured document. The unstructured document includes numerous primitive elements, but does not include structural elements that specify the structural relationship between the primitive elements and/or structural attributes of the document based on these primitive elements. To define the structured document, the primitive elements of the unstructured document are used to identify various geometric attributes of the unstructured document. The identified geometric attributes and other attributes of the primitive elements are used to define structural elements, such as associated primitive elements (e.g., words, paragraphs, joined graphs, etc.), tables, guides, gutters, etc., as well as to define the flow of reading through the primitive and structural elements. Various methods to enhance the efficiency of the geometric analysis and document reconstruction processes (e.g., hierarchical profiling, efficient cluster analysis techniques, efficient data structures) are provided.
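As a rough illustration of how geometric attributes of primitive elements might yield a structural element, the Python sketch below groups glyphs on a single text line into words by the horizontal gaps between them. It is not the method claimed in the application. The `Glyph` class, the `group_into_words` function, and the fixed `gap_threshold` parameter are all hypothetical, and the fixed threshold is a deliberately simple stand-in for the cluster analysis the abstract mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """Hypothetical primitive element: one character with its horizontal extent."""
    char: str
    x: float      # left edge in page coordinates
    width: float


def group_into_words(glyphs, gap_threshold):
    """Group glyphs from one text line into words.

    A new word starts whenever the gap between the previous glyph's right
    edge and the next glyph's left edge exceeds `gap_threshold`.
    """
    words = []
    current = []
    prev_end = None
    for glyph in sorted(glyphs, key=lambda g: g.x):
        if prev_end is not None and glyph.x - prev_end > gap_threshold:
            words.append("".join(current))
            current = []
        current.append(glyph.char)
        prev_end = glyph.x + glyph.width
    if current:
        words.append("".join(current))
    return words


# Glyphs arrive unordered and carry no notion of "word".
line = [
    Glyph("b", x=14, width=5),
    Glyph("t", x=0, width=5),
    Glyph("e", x=19, width=5),
    Glyph("o", x=5, width=5),
]
print(group_into_words(line, gap_threshold=2))
```

In this sample the only gap wider than the threshold falls between "o" and "b", so the four glyphs become the words "to" and "be". A real system would derive the threshold from the distribution of gaps rather than fix it by hand.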