Quick Patent Search - Fast search of global patents, free patent database for commercial use - IPRDB

41. Invention application

US20120284381A1 SYSTEMS, METHODS AND DEVICES FOR EXTRACTING AND VISUALIZING USER-CENTRIC COMMUNITIES FROM EMAILS (status: in force)
Publication No.: US20120284381A1
Publication date: 2012-11-08
Application No.: US13099502
Filing date: 2011-05-03
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F15/173
CPC classification: G06Q10/10, H04L51/32
Abstract: Embodiments generally relate to systems and methods for extracting and visualizing user-centric communities from emails. A set of email data comprising a set of users can be identified and a communication graph comprising a center node can be generated from the email data. The center node can be removed from the communication graph and a set of communities can be determined from the remaining data. The center node can be reconnected to a center of each of the set of communities to form a community graph. The links connecting the center node with the center of each of the set of communities can have a weight calculated according to a formula. The community graph can be visualized and provided to an administrator.
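The steps in this abstract (build the graph, remove the center node, find communities, reconnect the center) can be illustrated with a minimal Python sketch. It is not the patented method: the abstract names neither the community-detection technique nor the weight formula, so connected components stand in for the former and a simple message count for the latter. The function name `user_centric_communities` and the sample mailbox are hypothetical.

```python
from collections import defaultdict

def user_centric_communities(emails, center):
    """emails: iterable of (sender, recipient) pairs from the center user's mailbox."""
    # Communication graph: each undirected edge counts the messages exchanged.
    weight = defaultdict(int)
    for sender, recipient in emails:
        if sender != recipient:
            weight[frozenset((sender, recipient))] += 1

    # Remove the center node: keep only edges that do not touch it.
    adjacency = defaultdict(set)
    for edge in weight:
        if center not in edge:
            a, b = tuple(edge)
            adjacency[a].add(b)
            adjacency[b].add(a)
    nodes = {n for edge in weight for n in edge} - {center}

    # ASSUMPTION: communities are the connected components of what remains.
    communities, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        communities.append(component)

    # Reconnect the center node to the center of each community.
    # ASSUMPTION: the community center is the member with the most in-community
    # neighbours (ties broken alphabetically), and the link weight is the number
    # of messages exchanged between the center user and the community.
    links = []
    for community in communities:
        hub = max(sorted(community), key=lambda n: len(adjacency[n] & community))
        link_weight = sum(weight.get(frozenset((center, m)), 0) for m in community)
        links.append((hub, sorted(community), link_weight))
    return links

mailbox = [("me", "ann"), ("ann", "bob"), ("bob", "me"), ("ann", "cat"),
           ("me", "dan"), ("dan", "eve"), ("me", "eve"), ("me", "eve")]
for hub, members, link_weight in user_centric_communities(mailbox, "me"):
    print(hub, members, link_weight)
```

Each returned tuple is one link of the community graph: the community center, the community members, and the weight of the link back to the center user.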

42. Invention application

US20120219191A1 LOCAL METRIC LEARNING FOR TAG RECOMMENDATION IN SOCIAL NETWORKS (status: in force)
Publication No.: US20120219191A1
Publication date: 2012-08-30
Application No.: US13036209
Filing date: 2011-02-28
Applicants: Mohamed Aymen Benzarti, Boris Chidlovskii, Nishant Vijayakumar
Inventors: Mohamed Aymen Benzarti, Boris Chidlovskii, Nishant Vijayakumar
IPC classification: G06K9/00, G06F15/16
CPC classification: G06Q30/0201, G06K9/00677, G06Q50/01
Abstract: A tag recommendation for an item to be tagged is generated by: selecting a set of candidate neighboring items in an electronic social network based on context of items in the electronic social network respective to an owner of the item to be tagged; selecting a set of nearest neighboring items from the set of candidate neighboring items based on distances of the candidate neighboring items from the item to be tagged as measured by an item comparison metric; and selecting at least one tag recommendation based on tags of the items of the set of nearest neighboring items. The item comparison metric may comprise a Mahalanobis distance metric trained on the set of candidate neighboring items to correlate the trained Mahalanobis distance between pairs of items of the set of candidate neighboring items with an overlap metric indicative of overlap of the tag sets of the two items.
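The three selection steps in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the metric matrix `M` is taken as already trained (the training step that correlates distance with tag overlap is omitted), "context" is reduced to "items owned by the owner's contacts", and the item layout (`owner`, `features`, `tags`) and function names are hypothetical.

```python
import math
from collections import Counter

def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semi-definite matrix M."""
    d = [a - b for a, b in zip(x, y)]
    q = sum(d[i] * M[i][j] * d[j] for i in range(len(d)) for j in range(len(d)))
    return math.sqrt(max(q, 0.0))

def recommend_tags(item, owner_contacts, items, M, k=2, n_tags=1):
    # Step 1: candidate neighbours chosen by context.
    # ASSUMPTION: context means "owned by one of the item owner's contacts".
    candidates = [it for it in items if it["owner"] in owner_contacts]
    # Step 2: the k candidates nearest to the item under the metric M.
    nearest = sorted(
        candidates,
        key=lambda it: mahalanobis(item["features"], it["features"], M),
    )[:k]
    # Step 3: recommend the tags that occur most often among the nearest items.
    votes = Counter(tag for it in nearest for tag in it["tags"])
    return [tag for tag, _ in votes.most_common(n_tags)]

photos = [
    {"owner": "ann", "features": [0.0, 0.0], "tags": ["beach", "sunset"]},
    {"owner": "bob", "features": [1.0, 0.0], "tags": ["beach"]},
    {"owner": "zed", "features": [0.0, 0.0], "tags": ["car"]},   # not a contact
    {"owner": "ann", "features": [0.0, 5.0], "tags": ["city"]},
]
M = [[1.0, 0.0], [0.0, 4.0]]   # hypothetical trained metric
new_photo = {"owner": "me", "features": [0.1, 0.1]}
print(recommend_tags(new_photo, {"ann", "bob"}, photos, M))
```

With the identity matrix for `M` the distance reduces to the ordinary Euclidean distance; a trained `M` stretches the feature dimensions that matter for tag overlap.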

43. Invention application

US20090018995A1 Semi-supervised visual clustering (status: in force)
Publication No.: US20090018995A1
Publication date: 2009-01-15
Application No.: US11827770
Filing date: 2007-07-13
Applicants: Boris Chidlovskii, Loic Lecerf
Inventors: Boris Chidlovskii, Loic Lecerf
IPC classification: G06F17/30
CPC classification: G06K9/6253, G06K9/622, G06K9/6251
Abstract: A clustering system includes a visual mapping sub-system configured to display an N-dimensional to two- or three-dimensional mapping of items to be clustered, where N is greater than three, the mapping having mapping parameters for the N-dimensions. A user interface sub-system is configured to receive user inputted values for the mapping parameters, user inputted values selecting whether selected mapping parameters are fixed or adjustable, and user inputted values associating selected items with selected groups. An adjustment sub-system is configured to adjust the adjustable mapping parameters, without adjusting any fixed mapping parameters, to improve a measure of distinctness of one or more groups of items in the two- or three-dimensional mapping.

44. Granted invention patent

US07440974B2 Method for automatic wrapper repair (status: in force)
Publication No.: US07440974B2
Publication date: 2008-10-21
Application No.: US11295367
Filing date: 2005-12-05
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F17/30
CPC classification: G06F17/30893, Y10S707/99931, Y10S707/99932, Y10S707/99942, Y10S707/99945, Y10S707/99948
Abstract: A method of information extraction from a Web page using an initial wrapper which has become partially inoperative, wherein the initial wrapper comprises an initial set of rules for extracting information and for assigning labels from a wrapper set of labels to the extracted information, includes: using the initial set of rules to extract strings from the Web page parsed in the forward direction; analyzing the extracted strings according to the initial set of rules for assigning labels associated with the wrapper; assigning labels to those strings which satisfy the label rules; using the initial set of rules to extract strings from the Web page in the backward (opposite) direction; analyzing the extracted strings according to the set of rules for assigning labels associated with the wrapper; and assigning labels to those unlabeled strings which satisfy the label rules.

45. Invention application

US20080147574A1 Active learning methods for evolving a classifier (status: lapsed)
Publication No.: US20080147574A1
Publication date: 2008-06-19
Application No.: US11638732
Filing date: 2006-12-14
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F15/18
CPC classification: G06F17/30705, G06N99/005
Abstract: A method and system are provided for classifying data items such as a document based upon identification of element instances within the data item. A training set of classes is provided where each class is associated with one or more features indicative of accurate identification of an element instance within the data item. Upon the identification of the data item with the training set, a confidence factor is computed that the selected element instance is accurately identified. When a selected element instance has a low confidence factor, the associated features for the predicted class are changed by an annotator/expert so that the changed class definition of the new associated feature provides a higher confidence factor of accurate identification of element instances within the data item.
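The selection step implied by this abstract, routing low-confidence element instances to an annotator, might look like the Python sketch below. It rests on assumptions the abstract does not confirm: the confidence factor is taken to be the probability of the predicted class, and the threshold value, the function name and the input layout are hypothetical.

```python
def low_confidence_instances(predictions, threshold=0.6):
    """predictions: {instance_id: {class_label: probability}}.

    Returns (instance_id, predicted_label, confidence) tuples for the instances
    whose confidence falls below the threshold, least confident first.
    """
    queue = []
    for instance_id, probabilities in predictions.items():
        # ASSUMPTION: confidence factor = probability of the predicted class.
        label = max(probabilities, key=probabilities.get)
        confidence = probabilities[label]
        if confidence < threshold:
            queue.append((instance_id, label, confidence))
    return sorted(queue, key=lambda row: row[2])

predictions = {
    "e1": {"title": 0.9, "author": 0.1},
    "e2": {"title": 0.55, "author": 0.45},
    "e3": {"title": 0.4, "author": 0.35, "date": 0.25},
}
for row in low_confidence_instances(predictions):
    print(row)   # these instances would be shown to the annotator/expert
```

After the annotator corrects the class features for the queued instances, the classifier would be retrained and the loop repeated until few instances remain below the threshold.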

46. Invention application

US20070150443A1 Document alignment systems for legacy document conversions (status: in force)
Publication No.: US20070150443A1
Publication date: 2007-06-28
Application No.: US11315458
Filing date: 2005-12-22
Applicants: Andre Bergholz, Boris Chidlovskii
Inventors: Andre Bergholz, Boris Chidlovskii
IPC classification: G06F17/30
CPC classification: G06F17/30569
Abstract: A method for aligning documents which may be in different XML formats includes inputting source and target leaves of source and target documents in first and second tree-structured formats and assigning a cost to each of a plurality of matches. Each match may include a source leaf and a target leaf or be an unmatched source or target leaf. Matches are identified for which a total cost is minimal, wherein each of the leaves is in at least one of the identified matches. From the identified matches, groups of two or more matches are identified which have a leaf in common. From the groups, probable matches are identified in which more than one target leaf is matched with at least one source leaf or more than one source leaf is matched with a target leaf. An alignment between leaves of the target document and leaves of the source document is output which includes the probable matches.
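The matching step in this abstract can be illustrated with a small Python sketch. It uses exhaustive search, which is only practical for a handful of leaves and is a stand-in for whatever optimization the patent actually uses; the cost values, leaf names and the function name `align_leaves` are all hypothetical.

```python
from itertools import combinations

def align_leaves(sources, targets, pair_cost, unmatched_cost):
    # Candidate matches: every (source, target) pair, plus each leaf left unmatched.
    candidates = [((s, t), pair_cost[(s, t)]) for s in sources for t in targets]
    candidates += [((s, None), unmatched_cost) for s in sources]
    candidates += [((None, t), unmatched_cost) for t in targets]

    # Find the cheapest set of matches in which every leaf appears at least once.
    # ASSUMPTION: brute force over all subsets, fine only for tiny examples.
    best, best_cost = [], float("inf")
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            covered_sources = {m[0] for m, _ in subset} - {None}
            covered_targets = {m[1] for m, _ in subset} - {None}
            if covered_sources != set(sources) or covered_targets != set(targets):
                continue
            cost = sum(c for _, c in subset)
            if cost < best_cost:
                best, best_cost = [m for m, _ in subset], cost

    # Group matches that share a leaf: these are the probable 1-to-n and n-to-1 matches.
    by_source, by_target = {}, {}
    for s, t in best:
        if s is not None and t is not None:
            by_source.setdefault(s, []).append(t)
            by_target.setdefault(t, []).append(s)
    one_to_many = {s: ts for s, ts in by_source.items() if len(ts) > 1}
    many_to_one = {t: ss for t, ss in by_target.items() if len(ss) > 1}
    return best_cost, one_to_many, many_to_one

costs = {("name", "first"): 1, ("name", "last"): 1, ("name", "address"): 9,
         ("addr", "first"): 9, ("addr", "last"): 9, ("addr", "address"): 1}
print(align_leaves(["name", "addr"], ["first", "last", "address"], costs, unmatched_cost=5))
```

In this example the source leaf `name` is cheaper to match with both `first` and `last` than to leave either target unmatched, so the pair of matches sharing `name` is reported as a probable one-to-many match.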

47. Invention application

US20070022373A1 Probabilistic learning method for XML annotation of documents (status: in force)
Publication No.: US20070022373A1
Publication date: 2007-01-25
Application No.: US11170542
Filing date: 2005-06-29
Applicants: Boris Chidlovskii, Jerome Fuselier
Inventors: Boris Chidlovskii, Jerome Fuselier
IPC classification: G06F17/00
CPC classification: G06F17/241, G06F17/2247
Abstract: A document processor includes a parser that parses a document using a grammar having a set of terminal elements for labeling leaves, a set of non-terminal elements for labeling nodes, and a set of transformation rules. The parsing generates a parsed document structure including terminal element labels for fragments of the document and a nodes tree linking the terminal element labels and conforming with the transformation rules. An annotator annotates the document with structural information based on the parsed document structure.

48. Invention application

US20060288275A1 Method for classifying sub-trees in semi-structured documents (status: under examination, published)
Publication No.: US20060288275A1
Publication date: 2006-12-21
Application No.: US11156776
Filing date: 2005-06-20
Applicants: Boris Chidlovskii, Jerome Fuselier
Inventors: Boris Chidlovskii, Jerome Fuselier
IPC classification: G06F17/00
CPC classification: G06F16/83
Abstract: A method and system for classifying semi-structured documents by distinguishing sub-tree structural information as a distinct representative characteristic of a fragment of the document structure identified by a sub-tree node therein. The structural information comprises both an inner structure and an outer structure which individually can be exploited as representative data in a probabilistic classifier for classifying the sub-tree itself or the entire document. Additional representative feature data can also be independently used for classification and comprises the data content of the fragment structurally represented by the sub-tree, together with node attributes. The classification values independently generated from each of the different sets of features can then be combined in an assembly classifier to generate an automated classification system.
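The four feature sets named in this abstract (inner structure, outer structure, content, node attributes) can be illustrated with Python's standard `xml.etree.ElementTree`. This is a sketch under assumptions: the abstract does not define the feature encodings or the assembly classifier, so the outer structure is encoded as the ancestor tag path, the inner structure as the sorted descendant tags, and the combination as a plain average of per-class scores. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def subtree_features(root, node):
    """Extract the four feature sets for the sub-tree rooted at `node`."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    # Outer structure (ASSUMPTION): tags on the path from the root to the node's parent.
    path, current = [], node
    while current in parent:
        current = parent[current]
        path.append(current.tag)
    outer = "/".join(reversed(path))
    # Inner structure (ASSUMPTION): sorted tags of all descendants of the node.
    inner = sorted(d.tag for d in node.iter() if d is not node)
    # Content: the text of the fragment represented by the sub-tree.
    content = " ".join(t.strip() for t in node.itertext() if t.strip())
    return {"outer": outer, "inner": inner,
            "content": content, "attributes": dict(node.attrib)}

def combine_scores(score_sets):
    """ASSUMPTION: the assembly step averages per-class scores of the classifiers."""
    labels = {label for scores in score_sets for label in scores}
    return {label: sum(s.get(label, 0.0) for s in score_sets) / len(score_sets)
            for label in labels}

doc = ET.fromstring(
    "<doc><section id='s1'><title>Intro</title>"
    "<para>Hello <b>world</b></para></section></doc>"
)
print(subtree_features(doc, doc.find("section")))
print(combine_scores([{"header": 0.5, "body": 0.5}, {"header": 1.0, "body": 0.0}]))
```

Each feature set would feed its own probabilistic classifier; `combine_scores` shows one simple way to merge their independent outputs into a single decision.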

49. Invention application

US20060074999A1 Method for automatic wrapper repair (status: in force)
Publication No.: US20060074999A1
Publication date: 2006-04-06
Application No.: US11294870
Filing date: 2005-12-05
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F17/30
CPC classification: G06F17/30893, Y10S707/99931, Y10S707/99932, Y10S707/99942, Y10S707/99945, Y10S707/99948
Abstract: A method of information extraction from a Web page using a broken wrapper includes: using the wrapper to extract strings from the Web page parsed in the forward direction; analyzing the extracted strings according to a set of rules for assigning labels associated with the wrapper; assigning labels to those strings which satisfy the label rules; classifying the extracted strings based on content features of the labeled extracted strings; and validating those labeled extracted strings which satisfy the label rules within some threshold value.

50. Granted invention patent

US06792576B1 System and method of automatic wrapper grammar generation (status: in force)
Publication No.: US06792576B1
Publication date: 2004-09-14
Application No.: US09361496
Filing date: 1999-07-26
Applicant: Boris Chidlovskii
Inventor: Boris Chidlovskii
IPC classification: G06F15/00
CPC classification: G06F17/30569
Abstract: A method for generating a wrapper grammar for a file having a structure of a particular format includes providing at least one sample file of the particular format, where the particular format comprises a plurality of string tokens. Each sample file includes a plurality of tokens (data strings) which may be actual data from the document, an HTML tag or some other grammatical separator. The sample file of the particular format is then processed by annotating attributable tokens with a user-defined attribute, such as Author, Title, etc. from a set of attributes to form an annotated sample set. The annotated sample set is then evaluated to determine if wrapper grammar generation is possible, and if it is possible, a wrapper grammar for the files having a structure of the particular format is generated. Preferably, the annotated sample set is evaluated by determining if all attributes in the annotated sample set are distinguishable from one another.
