    • 2. Invention application
    • INFORMATION RETRIEVAL AND TEXT MINING USING DISTRIBUTED LATENT SEMANTIC INDEXING
    • WO2004100130A2
    • 2004-11-18
    • PCT/US2004/012462
    • 2004-04-23
    • TELCORDIA TECHNOLOGIES, INC.
    • BEHRENS, Clifford A.; BASSU, Devasis
    • G11B
    • G06F17/3071; Y10S707/99935; Y10S707/99943
    • The use of latent semantic indexing (LSI) for information retrieval and text mining operations is adapted to work on large heterogeneous data sets by first partitioning the data set into a number of smaller partitions having similar concept domains. A similarity graph network is generated in order to expose links between concept domains, which are then exploited in determining which domains to query as well as in expanding the query vector. LSI is performed on those partitioned data sets most likely to contain information related to the user query or text mining operation. In this manner, LSI can be applied to datasets that heretofore presented scalability problems. Additionally, the computation of the singular value decomposition of the term-by-document matrix can be accomplished at various distributed computers, increasing the robustness of the retrieval and text mining system while decreasing search times.
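
The flow described in the abstract (partition the collection, relate the partitions, route the query to the most promising partitions, then run LSI only there) can be illustrated with a small NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the six-term vocabulary, the two partitions `math` and `law`, the use of partition centroids to pick which partitions to query, the centroid cosine used as a stand-in for the similarity graph, and the helper names `lsi`, `fold_in`, `domain_similarity`, and `search`. Query expansion is not shown.

```python
import numpy as np

# Hypothetical vocabulary (rows): svd, matrix, index, court, patent, claim.
# Each partition is a small term-by-document matrix (terms x documents).
partitions = {
    "math": np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1],
                      [0, 0, 0], [0, 0, 0], [0, 0, 0]], dtype=float),
    "law": np.array([[0, 0, 0], [0, 0, 0], [0, 0, 0],
                     [1, 0, 1], [1, 1, 0], [0, 1, 1]], dtype=float),
}

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def lsi(A, k):
    """Rank-k truncated SVD of A; returns U_k, singular values, and V_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :].T  # rows of V_k are document vectors

def fold_in(q, U, s):
    """Project a term-space query into the LSI space: q^T U_k S_k^{-1}."""
    return (q @ U) / s

def domain_similarity(parts):
    """Assumed stand-in for the similarity graph: centroid cosine per pair."""
    names = sorted(parts)
    return {(a, b): cosine(parts[a].mean(axis=1), parts[b].mean(axis=1))
            for i, a in enumerate(names) for b in names[i + 1:]}

def search(q, parts, k=2, top_partitions=1):
    """Query only the partitions whose centroid is closest to the query."""
    ranked = sorted(parts, reverse=True,
                    key=lambda name: cosine(q, parts[name].mean(axis=1)))
    results = []
    for name in ranked[:top_partitions]:
        U, s, V = lsi(parts[name], k)  # could run on a separate machine
        q_hat = fold_in(q, U, s)
        results += [(cosine(q_hat, d), name, j) for j, d in enumerate(V)]
    return sorted(results, reverse=True)  # (score, partition, doc index)

query = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # terms "svd" and "matrix"
hits = search(query, partitions)
edges = domain_similarity(partitions)
```

Because each partition has its own small SVD, the `lsi` call inside the loop is the step that could be handed to separate machines, which is the scalability point the abstract makes. In this toy data the two partitions share no terms, so the single edge returned by `domain_similarity` has weight zero.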