Quick Patent Search - Fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US08359282B2 Supervised semantic indexing and its extensions (In force)
Publication No.: US08359282B2
Publication date: 2013-01-22
Application No.: US12562840
Filing date: 2009-09-18
Applicant: Bing Bai, Jason Weston, Ronan Collorbert, David Grangier
Inventor: Bing Bai, Jason Weston, Ronan Collorbert, David Grangier
IPC classification: G06F15/18, G06F7/00
CPC classification: G06F17/30663, G06F17/30616
Abstract: A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.
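
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the "first score" is computed or which similarity function is used, so this sketch assumes a precomputed co-occurrence score table and cosine similarity. All names (top_correlated, vectorize, cosine, the toy vocabulary and scores) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def top_correlated(rare_word, frequent, scores, n, threshold):
    """Return up to n (frequent_word, score) pairs for rare_word.

    Only pairs whose score meets the threshold are kept. `scores` is an
    assumed, precomputed table standing in for the abstract's "first score".
    """
    scored = [(scores.get((rare_word, f), 0.0), f) for f in frequent]
    scored = [(s, f) for s, f in scored if s >= threshold]
    scored.sort(key=lambda sf: (-sf[0], sf[1]))  # best score first, ties by word
    return [(f, s) for s, f in scored[:n]]


def vectorize(tokens, frequent, rare_map):
    """Build a feature vector over the frequent dictionary only.

    Rare words are replaced by their correlated frequent words, weighted by
    the correlation score (the weighting is an assumption of this sketch).
    """
    vec = Counter()
    for tok in tokens:
        if tok in frequent:
            vec[tok] += 1.0
        else:
            for f, s in rare_map.get(tok, []):
                vec[f] += s
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse vectors (assumed similarity)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


# Toy data, invented for illustration.
frequent = {"car", "engine", "fruit"}
scores = {
    ("carburetor", "engine"): 0.9,
    ("carburetor", "car"): 0.6,
    ("carburetor", "fruit"): 0.05,
}
rare_map = {"carburetor": top_correlated("carburetor", frequent, scores, n=2, threshold=0.5)}

query = vectorize(["carburetor"], frequent, rare_map)
doc_1 = vectorize(["car", "engine", "engine"], frequent, rare_map)
doc_2 = vectorize(["fruit"], frequent, rare_map)

score_d1 = cosine(query, doc_1)
score_d2 = cosine(query, doc_2)
print(score_d1, score_d2)
```

In this toy setup the rare query word "carburetor" never appears in either document, yet the query still matches the car-related document through the frequent words it was mapped to, while the unrelated document scores zero.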

2. Invention patent application

US20100185659A1 SUPERVISED SEMANTIC INDEXING AND ITS EXTENSIONS (In force)
Publication No.: US20100185659A1
Publication date: 2010-07-22
Application No.: US12562840
Filing date: 2009-09-18
Applicant: BING BAI, JASON WESTON, RONAN COLLORBERT, DAVID GRANGIER
Inventor: BING BAI, JASON WESTON, RONAN COLLORBERT, DAVID GRANGIER
IPC classification: G06F17/30
CPC classification: G06F17/30663, G06F17/30616
Abstract: A system and method for determining a similarity between a document and a query includes providing a frequently used dictionary and an infrequently used dictionary in storage memory. For each word or gram in the infrequently used dictionary, n words or grams are correlated from the frequently used dictionary based on a first score. Features for a vector of the infrequently used words or grams are replaced with features from a vector of the correlated words or grams from the frequently used dictionary when the features from a vector of the correlated words or grams meet a threshold value. A similarity score is determined between weight vectors of a query and one or more documents in a corpus by employing the features from the vector of the correlated words or grams that met the threshold value.
