    • 64. Granted invention patent
    • METHOD AND SYSTEM FOR SIMILARITY SEARCH AND CLUSTERING
    • EP1459206B1
    • 2007-07-11
    • EP02773177.7
    • 2002-08-09
    • Endeca Technologies, Inc.
    • TUNKELANG, Daniel
    • G06F17/30
    • G06F17/3071; G06F17/30477
    • Provided is a similarity search method that makes use of a localized distance metric. The data includes a collection of items, wherein each item is associated with a set of properties. The distance between two items is defined in terms of the number of items in the collection that are associated with the set of properties common to the two items. A query is generally composed of a set of properties. The distance between a query and an item is defined in terms of the number of items in the collection that are associated with the set of properties common to the query and the item. The properties can be of various types, such as binary, partially ordered, or numeric. The distance metric may be applied explicitly or implicitly for similarity search. One embodiment of this invention uses random walks such that the similarity search can be performed exactly or approximately, trading off accuracy against performance. The distance metric of the present invention can also be the basis for matching and clustering applications. In these contexts, the distance metric of the present invention may be used to build a graph, to which matching or clustering algorithms can be applied.
    • 69. Published invention application
    • METHOD AND SYSTEM FOR SIMILARITY SEARCH AND CLUSTERING
    • EP1459206A1
    • 2004-09-22
    • EP02773177.7
    • 2002-08-09
    • Endeca Technologies, Inc.
    • TUNKELANG, Daniel
    • G06F17/30
    • G06F17/3071; G06F17/30477
    • Provided is a similarity search method that makes use of a localized distance metric. The data includes a collection of items, wherein each item is associated with a set of properties. The distance between two items is defined in terms of the number of items in the collection that are associated with the set of properties common to the two items. A query is generally composed of a set of properties. The distance between a query and an item is defined in terms of the number of items in the collection that are associated with the set of properties common to the query and the item. The properties can be of various types, such as binary, partially ordered, or numeric. The distance metric may be applied explicitly or implicitly for similarity search. One embodiment of this invention uses random walks such that the similarity search can be performed exactly or approximately, trading off accuracy against performance. The distance metric of the present invention can also be the basis for matching and clustering applications. In these contexts, the distance metric of the present invention may be used to build a graph, to which matching or clustering algorithms can be applied.
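
The distance metric described in the abstract can be illustrated with a small sketch. The abstract says only that the distance is defined "in terms of" a count of items, so the code below makes some assumptions: it uses the raw count as the distance, it handles only binary (set-valued) properties, and it evaluates the metric explicitly with no random walks. All names (`ITEMS`, `count_matching`, `distance`, `query_distance`, `search`) and the sample data are hypothetical and do not come from the patent.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Hypothetical sample collection: item name -> set of binary properties.
ITEMS: Dict[str, FrozenSet[str]] = {
    "a": frozenset({"red", "round", "small"}),
    "b": frozenset({"red", "round", "large"}),
    "c": frozenset({"red", "square", "small"}),
    "d": frozenset({"blue", "square", "large"}),
}


def count_matching(collection: Dict[str, FrozenSet[str]],
                   props: FrozenSet[str]) -> int:
    """Count the items that carry every property in `props`."""
    return sum(1 for item_props in collection.values() if props <= item_props)


def distance(collection: Dict[str, FrozenSet[str]], x: str, y: str) -> int:
    """Assumed distance: number of items sharing the properties common to x and y.

    A small count means the shared properties are rare, so x and y are close.
    """
    common = collection[x] & collection[y]
    return count_matching(collection, common)


def query_distance(collection: Dict[str, FrozenSet[str]],
                   query: Iterable[str], item: str) -> int:
    """Same idea, with a query (a set of properties) in place of one item."""
    common = frozenset(query) & collection[item]
    return count_matching(collection, common)


def search(collection: Dict[str, FrozenSet[str]],
           query: Iterable[str], k: int) -> List[Tuple[int, str]]:
    """Explicit (exhaustive) similarity search: the k nearest items to the query."""
    query_props = frozenset(query)
    scored = sorted(
        (query_distance(collection, query_props, name), name)
        for name in collection
    )
    return scored[:k]


if __name__ == "__main__":
    print(distance(ITEMS, "a", "b"))            # items sharing {red, round}
    print(distance(ITEMS, "a", "d"))            # nothing in common: whole collection
    print(search(ITEMS, {"red", "round"}, 2))   # two nearest items to the query
```

With this definition, two items that share nothing are at the maximum distance (the size of the collection), because an empty property set matches every item. The same pairwise distances could serve as edge weights when building a graph for matching or clustering.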