    • 2. Granted invention patent
    • Duplicate entry detection system and method
    • Publication number: US08046372B1
    • Publication date: 2011-10-25
    • Application number: US11754237
    • Filing date: 2007-05-25
    • Srikanth Thirumalai, Aswath Manoharan, Mark J. Tomko, Grant M. Emery, Vijai Mohan, Egidio Terra
    • G06F7/00, G06F17/30
    • G06F17/30616
    • A computer system and method for determining whether the subject matter described in a received document is substantially similar to the subject matter of other documents in a document corpus, such that the received document can be considered a duplicate document. After receiving a first document, a set of tokens for the first document is generated. A non-fielded relevance search on a token index is executed. The relevance search returns a set of candidate duplicate documents with scores corresponding to each candidate document. For each candidate document with a score above a threshold, filtering is performed on each candidate document to determine whether each candidate document is a true duplicate of the first document. A set of candidate documents with a score above the threshold that were not disqualified as candidate documents is then provided.
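The flow in the abstract above (tokenize the received document, run a non-fielded relevance search on a token index, keep candidates scoring above a threshold, then filter out false positives) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the tokenizer, the scoring function, or the filter, so the Jaccard overlap score, the `passes_filter` check, and all class and function names here are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase alphanumeric tokens (assumed; the abstract names no tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class TokenIndex:
    """Inverted index from token to document ids.

    Non-fielded: all text of a document is pooled into one token set.
    """

    def __init__(self):
        self.postings = defaultdict(set)
        self.doc_tokens = {}

    def add(self, doc_id, text):
        tokens = tokenize(text)
        self.doc_tokens[doc_id] = tokens
        for tok in tokens:
            self.postings[tok].add(doc_id)

    def search(self, tokens):
        """Return {doc_id: score} for documents sharing at least one token.

        The score is Jaccard overlap, an assumed stand-in for a relevance score.
        """
        candidates = set()
        for tok in tokens:
            candidates |= self.postings.get(tok, set())
        return {
            d: len(tokens & self.doc_tokens[d]) / len(tokens | self.doc_tokens[d])
            for d in candidates
        }


def passes_filter(tokens, candidate_tokens, max_extra=2):
    """Hypothetical filter: disqualify candidates with too many unshared tokens."""
    return len(candidate_tokens - tokens) <= max_extra


def find_duplicates(index, text, threshold=0.5):
    """Candidates scoring above the threshold that the filter did not disqualify."""
    tokens = tokenize(text)
    scores = index.search(tokens)
    return sorted(
        d for d, s in scores.items()
        if s > threshold and passes_filter(tokens, index.doc_tokens[d])
    )
```

For example, after indexing "Red cotton t-shirt size large" and "Red cotton t-shirt size large bulk pack of three", a query of "red cotton T-shirt, size large" scores both above the threshold, but the second is disqualified by the filter because it carries four tokens the query lacks.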
    • 5. Granted invention patent
    • Comparison engine for identifying documents describing similar subject matter
    • Publication number: US07904462B1
    • Publication date: 2011-03-08
    • Application number: US11953726
    • Filing date: 2007-12-10
    • Srikanth Thirumalai, Aswath Manoharan, Mark J. Tomko, Grant M. Emery, Vijai Mohan, Egidio Terra
    • G06F7/00, G06F17/00
    • G06Q30/06
    • Systems and methods for determining whether a first document is a potential duplicate of a second document such that the two documents describe the same or substantially the same subject matter, wherein the first and second documents include attribute data in attribute fields. A set of rules is obtained for determining whether the first document is a potential duplicate of the second document. Moreover, for each rule in the set of rules, a determination is made as to whether data in a first set of attributes of the first document is contained in a second set of attributes of the second document. According to the results of the evaluated rules in the rules set, determining whether the first document is a potential duplicate of the second document. If, according to the evaluated rules in the rules set, the first document is determined to be a potential duplicate of the second document, storing a reference to the first document in a set of potential duplicates of the second document.
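The rule evaluation described in the abstract above (for each rule, check whether data in one set of attributes of the first document is contained in a set of attributes of the second document, then store a reference when the rules indicate a potential duplicate) might look roughly like the sketch below. The rule format, the field names, the case-insensitive substring test, and the "any one rule suffices" policy are all assumptions made for illustration; the abstract does not say how rule results are combined.

```python
from collections import defaultdict

# Each hypothetical rule pairs attribute fields of the first document with
# the attribute fields of the second document that must contain their data.
RULES = [
    (("title",), ("title", "description")),
    (("brand", "model"), ("title",)),
]


def rule_matches(rule, first, second):
    """True if every non-empty value from the first document's fields
    appears (case-insensitively) in the second document's fields."""
    first_fields, second_fields = rule
    haystack = " ".join(second.get(f, "") for f in second_fields).lower()
    values = [first.get(f, "").lower() for f in first_fields]
    return all(v and v in haystack for v in values)


def is_potential_duplicate(first, second, rules=RULES):
    # Assumption: one matching rule is enough to flag a potential duplicate.
    return any(rule_matches(r, first, second) for r in rules)


# Maps a document id to the ids of documents flagged as its potential duplicates.
potential_duplicates = defaultdict(set)


def record_if_duplicate(first_id, first, second_id, second):
    """Store a reference to the first document under the second if flagged."""
    if is_potential_duplicate(first, second):
        potential_duplicates[second_id].add(first_id)
        return True
    return False
```

Keeping the rules as plain data makes it possible to add or reorder them without touching the evaluation logic, which is one plausible reading of "a set of rules is obtained" in the abstract.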