    • 1. Invention application
    • TERM-STATISTICS MODIFICATION FOR CATEGORY-BASED SEARCH
    • US20090307209A1
    • 2009-12-10
    • US12136069
    • 2008-06-10
    • David CARMEL, Adam DARLOW, Yael PETRUSCHKA, Aya SOFFER
    • David CARMEL, Adam DARLOW, Yael PETRUSCHKA, Aya SOFFER
    • G06F7/06; G06F17/30
    • G06F16/3347
    • An apparatus for searching a document collection is provided. The apparatus includes a memory, arranged to store a plurality of documents that contain terms and are each associated with one or more categories, and a search processor. The search processor is arranged to: provide an index of the terms indicating the documents in which the terms appear; estimate a first statistical distribution of each of at least some of the terms in the index over the documents in the collection; estimate a second statistical distribution of each of at least some of the categories over the documents in the collection; accept a query comprising one or more of the terms and a specified category restriction referring to at least one of the categories; compute, using the first and second estimated statistical distributions, a local term distribution indicative of occurrence frequencies of at least one of the terms in the query within the specified category restriction; determine a category-specific score for the at least one of the terms responsively to the local term distribution within the specified category restriction; and apply the query to the index using the category-specific score so as to return a response. To estimate the first and second statistical distributions, the processor constructs term histograms of the at least some of the terms in the index, constructs category histograms of the at least some of the categories, and maps the documents in the collection to bins of the histograms. The processor further determines a category restriction histogram based on the category histogram of the at least one of the categories, responsively to the category restriction, and multiplies the category restriction histogram by the term histogram of the at least one of the terms in the query so as to produce a localized term histogram.
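
The histogram steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. Several details are assumptions made only for illustration, since the abstract does not specify them: documents are mapped to bins by `doc_id % num_bins`, the category restriction histogram is taken as the fraction of each bin's documents that belong to the category, and the category-specific score is a smoothed IDF-style logarithm. All function names are hypothetical.

```python
import math
from collections import defaultdict


def build_histograms(docs, num_bins):
    """docs is a list of (terms, categories) pairs.

    Assumption: a document is mapped to bin doc_id % num_bins.
    Returns bin sizes plus per-term and per-category document counts per bin.
    """
    bin_sizes = [0] * num_bins
    term_hist = defaultdict(lambda: [0] * num_bins)
    cat_hist = defaultdict(lambda: [0] * num_bins)
    for doc_id, (terms, categories) in enumerate(docs):
        b = doc_id % num_bins
        bin_sizes[b] += 1
        for term in set(terms):
            term_hist[term][b] += 1
        for category in set(categories):
            cat_hist[category][b] += 1
    return bin_sizes, term_hist, cat_hist


def restriction_histogram(cat_hist, bin_sizes, category):
    """Assumed form: fraction of each bin's documents that are in the category."""
    counts = cat_hist.get(category, [0] * len(bin_sizes))
    return [c / n if n else 0.0 for c, n in zip(counts, bin_sizes)]


def localized_term_histogram(term_hist, restriction, term):
    """Bin-wise product of the term histogram and the restriction histogram."""
    counts = term_hist.get(term, [0] * len(restriction))
    return [t * r for t, r in zip(counts, restriction)]


def category_specific_score(localized, restriction, bin_sizes):
    """Assumed scoring: smoothed IDF over the estimated in-category counts."""
    local_df = sum(localized)
    local_n = sum(r * n for r, n in zip(restriction, bin_sizes))
    return math.log((local_n + 1) / (local_df + 1))


docs = [
    (["jaguar", "car"], ["autos"]),
    (["jaguar", "cat"], ["animals"]),
    (["car", "engine"], ["autos"]),
    (["cat", "lion"], ["animals"]),
]
bin_sizes, term_hist, cat_hist = build_histograms(docs, num_bins=2)
restriction = restriction_histogram(cat_hist, bin_sizes, "autos")
for term in ("jaguar", "car"):
    localized = localized_term_histogram(term_hist, restriction, term)
    print(term, localized, category_specific_score(localized, restriction, bin_sizes))
```

When a bin mixes documents from several categories, the bin-wise product only estimates the in-category document frequency, because it treats term and category membership as independent within the bin. Using more bins tightens the estimate at the cost of larger histograms.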