![SEMANTIC ENRICHMENT BY EXPLOITING TOP-K PROCESSING](/abs-image/US/2013/10/10/US20130268261A1/abs.jpg.150x150.jpg)
Basic information:
- Patent title: SEMANTIC ENRICHMENT BY EXPLOITING TOP-K PROCESSING
- Patent title (Chinese): 通过开发TOP-K处理的语义丰富
- Application number: US13701347 Filing date: 2011-06-03
- Publication number: US20130268261A1 Publication date: 2013-10-10
- Inventors: Jong Wook Kim, Ashwin S. Kashyap, Dekai Li, Sandilya Bhamidipati, Avinash Sridhar, Saurabh Mathur, Bankim A. Patel
- Applicants: Jong Wook Kim, Ashwin S. Kashyap, Dekai Li, Sandilya Bhamidipati, Avinash Sridhar, Saurabh Mathur, Bankim A. Patel
- Applicant address: FR Issy de Moulineaux
- Assignee: THOMSON LICENSING
- Current assignee: THOMSON LICENSING
- Current assignee address: FR Issy de Moulineaux
- International application: PCT/US11/38991 WO 20110603
- Main classification: G06F17/27
- IPC classification: G06F17/27
Abstract:
Proper representation of the meaning of texts is crucial to enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept-space derived from Wikipedia has received growing attention recently, due to its comprehensiveness and expertise. This concept-based representation is capable of extracting semantic relatedness between texts that cannot be deduced with the bag-of-words model. A key obstacle, however, to using Wikipedia as a semantic interpreter is that the sheer number of concepts derived from Wikipedia makes it hard to efficiently map texts into concept-space. An efficient algorithm is provided which is able to represent the meaning of a text by using the concepts that best match it. In particular, this approach first computes the approximate top-k concepts that are most relevant to the given text. These concepts are then leveraged to represent the meaning of the given text.
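
The two steps named in the abstract (select the k most relevant concepts, then represent the text by them) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `TERM_TO_CONCEPTS` index, its weights, and the function names `top_k_concepts` and `concept_vector` do not come from the patent. The sketch also scores every matching concept and selects the top k exactly, whereas the abstract describes an approximate top-k computation intended to avoid that cost at Wikipedia scale.

```python
import heapq
from collections import Counter, defaultdict

# Hypothetical toy inverted index: term -> {concept: weight}.
# In the abstract's setting the concepts would be derived from Wikipedia;
# these entries and weights are invented for illustration only.
TERM_TO_CONCEPTS = {
    "bank": {"Finance": 0.9, "River": 0.4},
    "loan": {"Finance": 0.8},
    "water": {"River": 0.7, "Chemistry": 0.3},
    "interest": {"Finance": 0.6, "Psychology": 0.2},
}


def top_k_concepts(text, k):
    """Return the k highest-scoring (concept, score) pairs for the text."""
    term_counts = Counter(text.lower().split())
    scores = defaultdict(float)
    for term, count in term_counts.items():
        # Terms missing from the index contribute nothing.
        for concept, weight in TERM_TO_CONCEPTS.get(term, {}).items():
            scores[concept] += count * weight
    # Exact selection (not the approximate method the abstract refers to).
    # Ties on score are broken by concept name so the result is deterministic.
    return heapq.nlargest(k, scores.items(), key=lambda item: (item[1], item[0]))


def concept_vector(text, k):
    """Represent the text as a sparse vector over its top-k concepts only."""
    return dict(top_k_concepts(text, k))


if __name__ == "__main__":
    print(concept_vector("bank loan interest water", 2))
```

Keeping only the top-k concepts gives a sparse representation, so two texts can be compared on a handful of concept weights instead of on every concept in the collection.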