    • 11. Invention application
    • Method and System for Processing a Text Search Query in a Collection of Documents
    • US20080091666A1
    • 2008-04-17
    • US11952627
    • 2007-12-07
    • Andrea Baader, Jochen Doerre, Monika Matschke, Andreas Neumann, Roland Seiffert
    • Andrea Baader, Jochen Doerre, Monika Matschke, Andreas Neumann, Roland Seiffert
    • G06F17/30
    • G06F17/30646, Y10S707/99933, Y10S707/99942, Y10S707/99943
    • The present invention provides a method and an infrastructure for processing a text search query in a collection of documents (100). A full posting index (200) is generated, stored, and updated for each document added to the collection (100). The full posting index (200) comprises a set of index terms and, for each index term of the set, a full posting list enumerating all occurrences of that index term in all documents of the collection (100). In addition to the full posting index (200), at least one additional posting index (400, 500, 600) is generated, stored, and updated for each document added to the collection (100). Each additional posting index (400, 500, 600) relates to a defined document part and comprises a set of index terms and, for each index term of the set, a restricted posting list enumerating all occurrences of that index term in that document part across all documents of the collection (100). A text search query comprises search conditions on search terms, which are translated into conditions on the index terms of the full posting index (200). The translated conditions of a given text search query are then optimized (a) by identifying all translated conditions that are restricted to defined document parts for which an additional posting index is available, and (b) by rewriting the identified part-restricted conditions as pair conditions on index terms of the additional posting index (400, 500, 600) and the corresponding document part. The pair conditions can thus be processed using only the additional posting index (400, 500, 600).
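The indexing and query-rewriting scheme in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. The class name `PostingIndexes`, the `(term, part)` condition format, the whitespace tokenizer, and the plan labels `"part_index"` and `"full_index"` are all assumptions made for illustration. The sketch keeps one full posting index plus one restricted posting index per chosen document part, and routes each part-restricted condition to the restricted index when one is available.

```python
from collections import defaultdict


class PostingIndexes:
    """Toy full posting index plus per-part restricted posting indexes (hypothetical)."""

    def __init__(self, indexed_parts):
        # Full index: term -> list of (doc_id, part_name, position).
        self.full = defaultdict(list)
        # Additional indexes: part_name -> term -> list of (doc_id, position).
        self.part = {name: defaultdict(list) for name in indexed_parts}

    def add_document(self, doc_id, parts):
        """Update every index when a document is added to the collection."""
        position = 0
        for part_name, text in parts.items():
            for token in text.lower().split():  # naive tokenizer, an assumption
                self.full[token].append((doc_id, part_name, position))
                if part_name in self.part:
                    self.part[part_name][token].append((doc_id, position))
                position += 1

    def rewrite(self, conditions):
        """Rewrite (term, part) conditions into a plan naming the index to use."""
        plan = []
        for term, part in conditions:
            if part is not None and part in self.part:
                # Part restriction with an additional index available:
                # becomes a pair condition on (part, term).
                plan.append(("part_index", part, term.lower()))
            else:
                plan.append(("full_index", part, term.lower()))
        return plan

    def search(self, conditions):
        """Return the ids of documents that satisfy all conditions (AND)."""
        result = None
        for source, part, term in self.rewrite(conditions):
            if source == "part_index":
                # Only the restricted posting list is read here.
                docs = {d for d, _ in self.part[part].get(term, [])}
            else:
                # Fall back to the full posting list, filtering by part if needed.
                docs = {d for d, p, _ in self.full.get(term, [])
                        if part is None or p == part}
            result = docs if result is None else result & docs
        return result or set()


idx = PostingIndexes(indexed_parts=["title"])
idx.add_document(1, {"title": "text search", "body": "posting index for search"})
idx.add_document(2, {"title": "index design", "body": "text in body"})
print(idx.rewrite([("text", "title"), ("text", "body")]))
print(idx.search([("index", "title"), ("text", "body")]))
```

In this sketch a condition restricted to `title` never touches the full posting lists, which is the saving the abstract describes. A condition on `body` has no additional index, so it is answered from the full index by filtering on the part name.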