    • 2. Invention application
    • KEYWORD EXPANSION METHOD AND SYSTEM, AND CLASSIFIED CORPUS ANNOTATION METHOD AND SYSTEM
    • US20160232211A1
    • 2016-08-11
    • US15025573
    • 2013-12-05
    • PEKING UNIVERSITY FOUNDER GROUP CO., LTD.; FOUNDER APABI TECHNOLOGY LIMITED; PEKING UNIVERSITY
    • Mao Ye; Zhi Tang; Jianbo Xu; Chao Lei; Lifeng Jin
    • G06F17/30
    • G06F16/24573; G06F16/2455; G06F16/3322; G06F16/3338
    • A keyword expansion method and system are provided. The method comprises searching with a predetermined initial keyword to obtain current keywords, which serve as the basis of the next search, and performing a loop search through keyword iteration. If the keyword error between the keywords obtained in the current search and those obtained in the previous search is less than a predetermined threshold, the keywords obtained in the current search are used as the expanded keywords of the initial keyword. This method may solve the prior-art problem of having to establish a thesaurus manually. A method and system for automatically annotating a classified corpus are also provided. The method comprises: determining one or more initial core keywords for each class; obtaining expanded keywords for each class by expanding the initial core keywords; and searching with the expanded keywords corresponding to a class to select a classified corpus and annotating the classified corpus.
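The iterative loop described in the abstract of US20160232211A1 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy `CORPUS`, the `search` and `top_keywords` helpers, the use of 1 minus Jaccard similarity as the "keyword error", and the `threshold` and `max_iters` defaults.

```python
from collections import Counter

# Hypothetical toy corpus; the patent does not specify a data source.
CORPUS = [
    "football match goal",
    "football league goal",
    "league match referee",
    "stock market price",
    "market price index",
]


def search(keywords):
    """Return every document that contains at least one of the keywords."""
    return [doc for doc in CORPUS if keywords & set(doc.split())]


def top_keywords(docs, k=3):
    """Pick the k most frequent words (ties broken alphabetically)."""
    counts = Counter(word for doc in docs for word in doc.split())
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return set(ranked[:k])


def keyword_error(current, previous):
    """Assumed error measure: 1 - Jaccard similarity of the two keyword sets."""
    union = current | previous
    if not union:
        return 0.0
    return 1.0 - len(current & previous) / len(union)


def expand_keywords(initial, search_fn, extract_fn, threshold=0.1, max_iters=20):
    """Search repeatedly, feeding each round's keywords into the next search,
    and stop once two consecutive rounds differ by less than the threshold."""
    previous = {initial}
    for _ in range(max_iters):
        current = extract_fn(search_fn(previous))
        if keyword_error(current, previous) < threshold:
            return current
        previous = current
    return previous  # no convergence within max_iters; return the last set


if __name__ == "__main__":
    print(sorted(expand_keywords("football", search, top_keywords)))
```

The `max_iters` cap is a safeguard added in this sketch so that the loop terminates even when the keyword sets oscillate; the abstract itself only states the threshold condition.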
    • 3. Granted invention patent
    • Apparatus and a method for logically processing a composite graph in a formatted document
    • US09569407B2
    • 2017-02-14
    • US14095682
    • 2013-12-03
    • Peking University Founder Group Co., Ltd.; Founder Apabi Technology Limited; Peking University
    • Canhui Xu; Zhi Tang; Xin Tao; Cao Shi
    • G06F17/22; G06F17/21; G06F3/0484; G06F17/27
    • G06F17/212; G06F3/0483; G06F3/0484; G06F17/2705; G06K9/00463
    • The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.
    • 4. Granted invention patent
    • Table recognizing method and table recognizing system
    • US09268999B2
    • 2016-02-23
    • US14096532
    • 2013-12-04
    • Peking University Founder Group Co., Ltd.; Founder Apabi Technology Limited
    • Canhui Xu; Zhi Tang; Jianbo Xu; Xin Tao
    • G06K9/62; G06K9/00
    • G06K9/00449; G06K9/00463
    • Provided is a table recognizing method, comprising: parsing and analyzing metadata information in an original fixed-layout document, and extracting basic elements on a page of the document; segmenting the basic elements, extracting segmented text lines on the page, and acquiring fragments; constructing an undirected graph with respect to each of the fragments; extracting an image on the page, detecting intersection points of horizontal lines and vertical lines, detecting an external bounding box of the intersection points, and taking whether the segmented text lines fall within the external bounding box as local relationship features; training a learning model according to the local relationship features, local features of the fragments, and neighborhood relationship features among the fragments, acquiring model parameters, and establishing a table recognizing model; and invoking the table recognizing model to perform table recognizing for the document, and acquiring a recognizing result.
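One step in the abstract of US09268999B2, the "local relationship feature" that records whether a text line falls inside the bounding box of the ruling-line intersections, can be sketched as follows. This is only an illustration under stated assumptions: the horizontal and vertical lines are taken as already detected and given as coordinates (the image-processing step is not shown), and the tuple layouts, the `tol` tolerance, and all function names are hypothetical rather than taken from the patent.

```python
from typing import List, Optional, Tuple

HLine = Tuple[float, float, float]       # (y, x_start, x_end), assumed layout
VLine = Tuple[float, float, float]       # (x, y_start, y_end), assumed layout
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def intersections(h_lines: List[HLine], v_lines: List[VLine],
                  tol: float = 1.0) -> List[Tuple[float, float]]:
    """Return the points where a horizontal and a vertical line cross."""
    points = []
    for y, hx0, hx1 in h_lines:
        for x, vy0, vy1 in v_lines:
            if hx0 - tol <= x <= hx1 + tol and vy0 - tol <= y <= vy1 + tol:
                points.append((x, y))
    return points


def bounding_box(points: List[Tuple[float, float]]) -> Optional[Box]:
    """External bounding box of the intersection points, or None if empty."""
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def falls_within(text_line: Box, outer: Optional[Box]) -> bool:
    """Binary feature: is the text-line box fully inside the outer box?"""
    if outer is None:
        return False
    return (outer[0] <= text_line[0] and outer[1] <= text_line[1]
            and text_line[2] <= outer[2] and text_line[3] <= outer[3])


if __name__ == "__main__":
    h = [(0.0, 0.0, 100.0), (50.0, 0.0, 100.0)]  # two horizontal rules
    v = [(0.0, 0.0, 50.0), (100.0, 0.0, 50.0)]   # two vertical rules
    outer = bounding_box(intersections(h, v))
    print(falls_within((10.0, 10.0, 40.0, 20.0), outer))
    print(falls_within((10.0, 60.0, 40.0, 70.0), outer))
```

In the abstract this feature is combined with local fragment features and neighborhood relationship features to train the learning model; the training step is outside the scope of this sketch.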
    • 6. Invention application
    • Apparatus And A Method For Logically Processing A Composite Graph In A Formatted Document
    • US20140337719A1
    • 2014-11-13
    • US14095682
    • 2013-12-03
    • Peking University Founder Group Co., Ltd.; Peking University; Founder Apabi Technology Limited
    • Canhui Xu; Zhi Tang; Xin Tao; Cao Shi
    • G06F3/0484; G06F17/27
    • G06F17/212; G06F3/0483; G06F3/0484; G06F17/2705; G06K9/00463
    • The present invention provides an apparatus for logically processing a composite graph in a formatted document, the apparatus comprising: a composite graph block extraction unit, used to extract a composite graph block in the formatted document; a document parsing unit, used to parse the formatted document to obtain a text element contained therein; a cutline element extraction unit, used to extract a cutline element from the text element; a correlativity detection unit, used to detect correlativity between the composite graph block and the cutline element; and a correlativity storage unit, used to store the detected correlativity. The present invention also provides a method for logically processing a composite graph in a formatted document. According to the technical scheme disclosed in the present invention, it is easy to achieve layout understanding of the composite graph in a graph-text mixed layout of the formatted document, so as to avoid a logical error.