Quick Patent Search: Fast Search of Global Patents, Free Patent Database for Commercial Use - IPRDB

61. Granted invention patent

US07730396B2 Systems and methods for converting legacy and proprietary documents into extended mark-up language format (status: no longer in force)
Publication No.: US07730396B2
Publication date: 2010-06-01
Application No.: US11598083
Filing date: 2006-11-13
Applicant: Boris Chidlovskii, Hervé Dejean
Inventor: Boris Chidlovskii, Hervé Dejean
IPC classification: G06F17/22
CPC classification: G06F17/30914, G06F17/227
Abstract: A system and method that converts legacy and proprietary documents into extended mark-up language format which treats the conversion as transforming ordered trees of one schema and/or model into ordered trees of another schema and/or model. In embodiments, the tree transformers are coded using a learning method that decomposes the converting task into three components which include path re-labeling, structural composition and input tree traversal, each of which involves learning approaches. The transformation of an input tree into an output tree may involve decomposing the input document, labeling components in the input tree with valid labels or paths from a particular output schema, composing the labeled elements into the output tree with a valid structure, and finding such a traversal of the input tree that achieves the correct composition of the output tree and applies structural rules.

62. Granted invention patent

US07296223B2 System and method for structured document authoring (status: in force)
Publication No.: US07296223B2
Publication date: 2007-11-13
Application No.: US10607667
Filing date: 2003-06-27
Applicant: Boris Chidlovskii, Hervé Déjean
Inventor: Boris Chidlovskii, Hervé Déjean
IPC classification: G06F15/00, G06F17/00
CPC classification: G06F17/2282, G06F17/2247, G06F17/24, Y10S707/99943
Abstract: A method for creating a structured document, wherein a structured document comprises a plurality of content elements wrapped in pairs of tags, includes parsing a document of a particular type containing content into a plurality of content elements; and for each content element, suggesting an optimal tag according to a tag suggestion procedure. The tag suggestion procedure includes providing sample data which has been converted into a structured sample document; deriving a set of tags from the structured sample document; evaluating the set of tags according to tag suggestion criteria to determine an optimal tag for the content element. The optimal tag may be a single tag or a pattern of tags which maximizes a similarity function with patterns found in the sample data.
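
Illustrative sketch (not taken from the patent): the abstract of US07296223B2 describes picking the tag that maximizes a similarity function against sample data. The abstract does not say which similarity function is used, so the Python below assumes a plain Jaccard similarity over word tokens. The names `tokens`, `jaccard`, `suggest_tag` and the `samples` data are hypothetical.

```python
import re


def tokens(text):
    """Lowercased word tokens of a content element, as a set."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_tag(content, samples):
    """Return the tag whose sample content is most similar to `content`.

    `samples` is a list of (tag, text) pairs taken from an already
    structured sample document. Ties go to the earliest sample.
    """
    target = tokens(content)
    best_tag, best_score = None, -1.0
    for tag, text in samples:
        score = jaccard(target, tokens(text))
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


# Hypothetical sample data standing in for a structured sample document.
samples = [
    ("title", "System and method for structured document authoring"),
    ("author", "Boris Chidlovskii"),
    ("date", "filed 2003-06-27"),
]
```

Under these assumptions, calling `suggest_tag("Method for document authoring", samples)` compares the new content element against each sample and returns the best-scoring tag. A real system would likely score tag patterns as well as single tags, as the abstract notes.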

63. Invention patent application

US20070150801A1 Interactive learning-based document annotation (status: in force)
Publication No.: US20070150801A1
Publication date: 2007-06-28
Application No.: US11316771
Filing date: 2005-12-23
Applicant: Boris Chidlovskii, Thierry Jacquin
Inventor: Boris Chidlovskii, Thierry Jacquin
IPC classification: G06F17/00, G06F15/00
CPC classification: G06F17/241, G06F17/2247, G06K9/6254
Abstract: A document annotation system 10 includes a graphical user interface 22 used by an annotator 30 to annotate documents. An active learning component 24 trains an annotation model and proposes annotations to documents based on the annotation model. A request handler 26, 32, 34, 42 conveys annotation requests from the graphical user interface 22 to the active learning component 24, conveys proposed annotations from the active learning component 24 to the graphical user interface 22, and selectably conveys evaluation requests from the graphical user interface 22 to a domain expert 40. During annotation, at least some low probability proposed annotations are presented to the annotator 30 by the graphical user interface 22. The presented low probability proposed annotations enhance training of the annotation model by the active learning component 24.
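
Illustrative sketch (not taken from the patent application): the abstract of US20070150801A1 says that low-probability proposed annotations are shown to the annotator so that the corrections improve the model. One common way to realize that idea is least-confidence selection, assumed here. The function `pick_for_review` and the `proposals` data are hypothetical.

```python
def pick_for_review(proposals, k):
    """Return the k proposals the model is least sure about.

    `proposals` is a list of (element, label, probability) tuples, where
    probability is the model's confidence in the proposed label.
    The sort is stable, so equal probabilities keep their input order.
    """
    return sorted(proposals, key=lambda p: p[2])[:k]


# Hypothetical model output for three document elements.
proposals = [
    ("Boris Chidlovskii", "author", 0.97),
    ("2005-12-23", "date", 0.55),
    ("Interactive learning", "title", 0.62),
]
```

Here `pick_for_review(proposals, 2)` would surface the date and title proposals for the annotator and leave the high-confidence author proposal alone.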

64. Granted invention patent

US07171620B2 System and method for managing document retention of shared documents (status: no longer in force)
Publication No.: US07171620B2
Publication date: 2007-01-30
Application No.: US10201717
Filing date: 2002-07-24
Applicant: Stefania Castellani, Boris Chidlovskii
Inventor: Stefania Castellani, Boris Chidlovskii
IPC classification: G06F17/00, G06F7/00, G06F17/30
CPC classification: G06Q10/10, G06F17/30011, Y10S707/99935
Abstract: The visibility of shared documents in a collaborative recommender system is managed by analyzing both the document's substance and user actions that are performed on the document. The document's substance includes both metadata and content. User actions include both user ratings and semantic actions. The visibility of the shared documents is updated when either a user action or an event occurs.

65. Invention patent application

US20060085468A1 Method for automatic wrapper repair (status: in force)
Publication No.: US20060085468A1
Publication date: 2006-04-20
Application No.: US11295367
Filing date: 2005-12-05
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F17/30
CPC classification: G06F17/30893, Y10S707/99931, Y10S707/99932, Y10S707/99942, Y10S707/99945, Y10S707/99948
Abstract: A method of information extraction from a Web page using an initial wrapper which has become partially inoperative, wherein the initial wrapper comprises an initial set of rules for extracting information and for assigning labels from a wrapper set of labels to the extracted information, includes using the initial set of rules to extract strings from the Web page parsed in forward direction; analyzing the extracted strings according to the initial set of rules for assigning labels associated with the wrapper; assigning labels to those strings which satisfy the label rules; using the initial set of rules to extract strings from the Web page in backward/(opposite) direction; analyzing the extracted strings according to the set of rules for assigning labels associated with the wrappers; and assigning labels to those unlabeled strings from which satisfy the label rules.
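
Illustrative sketch (not taken from the patent application): the abstract of US20060085468A1 describes a forward pass that labels what the old rules still match, followed by a backward pass that labels strings the forward pass missed. The Python below is a toy version under strong assumptions: the page is already split into tokens, a forward rule labels the token that follows a marker, and a backward rule labels the token that precedes a marker. The names `scan`, `extract` and all rule data are hypothetical.

```python
def scan(tokens, rules, labels, step):
    """Label the neighbor (in direction `step`) of each marker in `rules`.

    step=+1 walks the page forward and labels the token after a marker;
    step=-1 walks it backward and labels the token before a marker.
    Positions that already carry a label are left untouched.
    """
    n = len(tokens)
    order = range(n) if step == 1 else range(n - 1, -1, -1)
    for i in order:
        j = i + step
        if tokens[i] in rules and 0 <= j < n and j not in labels:
            labels[j] = rules[tokens[i]]


def extract(tokens, forward_rules, backward_rules):
    """Run the forward pass, then let the backward pass fill the gaps."""
    labels = {}
    scan(tokens, forward_rules, labels, +1)
    scan(tokens, backward_rules, labels, -1)
    return {tokens[i]: label for i, label in sorted(labels.items())}


# Hypothetical page after a redesign: the opening "<i>" became "<em>",
# so the forward rule for "year" no longer fires.
page = ["<b>", "Wrapper repair", "</b>", "<em>", "2006", "</i>"]
forward_rules = {"<b>": "title", "<i>": "year"}
backward_rules = {"</b>": "title", "</i>": "year"}
```

With this data, the forward pass labels only the title, and the backward pass recovers the year from the unchanged closing marker. The second wrapper-repair application in this list (US20060074998A1) describes a different, classifier-based approach that this sketch does not cover.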

66. Invention patent application

US20060074998A1 Method for automatic wrapper repair (status: in force)
Publication No.: US20060074998A1
Publication date: 2006-04-06
Application No.: US11294869
Filing date: 2005-12-05
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F17/30
CPC classification: G06F17/30893, Y10S707/99931, Y10S707/99932, Y10S707/99942, Y10S707/99945, Y10S707/99948
Abstract: A method for repairing a wrapper associated with an information source, includes defining a classifier, based on content features of extracted and labeled information using the wrapper, using the classifier to extract content information from the file according to a set of classifier extraction rules; analyzing the extracted content information according to the content features and assigning a label to any extracted content information which satisfies the label's rules; and defining a repaired wrapper as the classifier and those labels in the set which have been assigned to extracted content information. Additional content information and labels can be extracted by iteratively creating a classifier based on both content features and structure features of extracted strings.
