Patent Quick Search: quickly search global patents, free commercial patent database - IPRDB

1. Invention application

US20110040769A1 Query-URL N-Gram Features in Web Ranking
Legal status: Under examination (published)
Publication No.: US20110040769A1
Publication date: 2011-02-17
Application No.: US12541063
Filing date: 2009-08-13
Applicant: Huihsin Tseng, Longbin Chen, Yumao Lu, Fachun Peng
Inventor: Huihsin Tseng, Longbin Chen, Yumao Lu, Fachun Peng
IPC classification: G06F17/30
CPC classification: G06F16/951
Abstract: In one embodiment, access one or more pairs of search query and clicked Uniform Resource Locator (URL). For each of the pairs of search query and clicked URL, segment the search query into one or more query segments and the clicked URL into one or more URL segments; construct one or more query-URL n-grams, each of which comprises a query part comprising at least one of the query segments and a URL part comprising at least one of the URL segments; and calculate one or more association scores, each of which for one of the query-URL n-grams and represents a similarity between the query part and the URL part of the query-URL n-gram and is based on a first frequency of the query part and the URL part, a second frequency of the query part, and a third frequency of the URL part.
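The scoring step in this abstract can be illustrated with a minimal sketch. The abstract only says the score depends on three frequencies (joint, query part, URL part); using pointwise mutual information (PMI) is an assumption here, as are the whitespace query segmentation, the punctuation-based URL segmentation, the restriction to one query segment paired with one URL segment, and all function names.

```python
import math
import re
from collections import Counter

def segment_query(query):
    # Assumption: query segments are whitespace-separated, lowercased words.
    return query.lower().split()

def segment_url(url):
    # Assumption: drop the scheme, split on non-alphanumerics, ignore "www".
    url = re.sub(r"^[a-z]+://", "", url.lower())
    return [s for s in re.split(r"[^a-z0-9]+", url) if s and s != "www"]

def association_scores(pairs):
    """Score each (query segment, URL segment) pair with PMI (an assumed formula)."""
    joint, q_freq, u_freq = Counter(), Counter(), Counter()
    for query, url in pairs:
        q_segs = set(segment_query(query))
        u_segs = set(segment_url(url))
        q_freq.update(q_segs)   # second frequency: query part
        u_freq.update(u_segs)   # third frequency: URL part
        for q in q_segs:
            for u in u_segs:
                joint[(q, u)] += 1   # first frequency: query part and URL part together
    n = len(pairs)
    # PMI = log( p(q, u) / (p(q) * p(u)) ), with probabilities estimated over the pairs
    return {
        (q, u): math.log(c * n / (q_freq[q] * u_freq[u]))
        for (q, u), c in joint.items()
    }

# Hypothetical click data, for illustration only.
clicks = [
    ("apple store", "http://www.apple.com/store"),
    ("apple support", "https://support.apple.com"),
    ("banana bread", "http://recipes.example.com/banana"),
]
scores = association_scores(clicks)
```

With this toy data, a URL segment such as "com" that appears in every clicked URL scores zero against "apple", while segments that co-occur selectively with a query word score above zero.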

2. Invention application

US20100257150A1 QUERY IDENTIFICATION AND NORMALIZATION FOR WEB SEARCH
Legal status: In force
Publication No.: US20100257150A1
Publication date: 2010-10-07
Application No.: US12818036
Filing date: 2010-06-17
Applicant: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
Inventor: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06F17/3064
Abstract: A computer-implemented method for processing user entered query data to improve results of a search of pages using a local search database, when searching the internet, is disclosed. The method includes receiving the user entered query data and parsing each word of the query data and segmenting words using a probabilistic dictionary to determine a likelihood that the word is for a particular name. And, associating the particular names with a name tag to create one or more tagged name terms. Then, normalizing each of the tagged name terms and the normalizing including boosting information if found in the local search database and determining proximity between selected ones of the tagged name terms. The method then generates an optimized search query that incorporates normalized terms and operators. The optimized search query being applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.

3. Granted invention

US07693865B2 Techniques for navigational query identification
Legal status: In force
Publication No.: US07693865B2
Publication date: 2010-04-06
Application No.: US11514076
Filing date: 2006-08-30
Applicant: Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed
Inventor: Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed
IPC classification: G06F17/00
CPC classification: G06K9/623, G06F17/30707, G06F17/30864, G06K9/6278
Abstract: To accurately classify a query as navigational, thousands of available features are explored, extracted from major commercial search engine results, user Web search click data, query log, and the whole Web's relational content. To obtain the most useful features for navigational query identification, a three level system is used which integrates feature generation, feature integration, and feature selection in a pipeline. Because feature selection plays a key role in classification methodologies, the best feature selection method is coupled with the best classification approach to achieve the best performance for identifying navigational queries. According to one embodiment, linear Support Vector Machine (SVM) is used to rank features and the top ranked features are fed into a Stochastic Gradient Boosting Tree (SGBT) classification method for identifying whether or not a particular query is a navigational query.

4. Invention application

US20110264647A1 QUERY PROCESSING FOR WEB SEARCH
Legal status: In force
Publication No.: US20110264647A1
Publication date: 2011-10-27
Application No.: US13175797
Filing date: 2011-07-01
Applicant: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
Inventor: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06F17/3064
Abstract: A computer-implemented method for processing user entered query data to improve results of a search of pages using a database, when searching the internet, is disclosed. The method includes receiving the user entered query data and parsing each word of the query data and segmenting words using probability to determine a likelihood that the word is for a particular name. And, associating the particular names with a name tag to create one or more tagged name terms. Then, normalizing each of the tagged name terms and the normalizing including boosting information if found in the database and determining proximity between selected ones of the tagged name terms. The method then generates an optimized search query that incorporates normalized terms and operators. The optimized search query being applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.

5. Granted invention

US07769746B2 Local query identification and normalization for web search
Legal status: In force
Publication No.: US07769746B2
Publication date: 2010-08-03
Application No.: US12015448
Filing date: 2008-01-16
Applicant: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
Inventor: Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06F17/3064
Abstract: Computer-implemented methods and systems for processing user entered query data to improve results of a search of pages using a local search database are provided, when searching the internet. The method includes receiving the user entered query data and parsing each word of the query data and examining each word to determine if the word is associated with one of a business name, a city name or a state name. The examining uses probabilistic dictionaries to determine a likelihood that the word is for a particular term or intent. The method further includes normalizing each of the tagged business terms. The normalizing includes boosting information if found in the local search database and determining proximity between selected ones of the tagged terms. Then, generating an optimized internal search query that incorporates constraints and ranking based on at least the boosting information and the determined proximity between the selected tagged terms. The optimized internal search query is applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
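The word-examination step described in this abstract might look roughly like the sketch below. It only covers tagging each query word as a business, city, or state name from a probabilistic dictionary; normalization, boosting, and proximity are left out. The dictionary contents, the 0.5 threshold, and the names `PROB_DICT` and `tag_query` are all hypothetical.

```python
# Hypothetical probabilistic dictionary: word -> {tag: likelihood}.
PROB_DICT = {
    "starbucks": {"business": 0.95},
    "seattle": {"city": 0.90, "business": 0.05},
    "washington": {"state": 0.60, "city": 0.30},
}

def tag_query(query, threshold=0.5):
    """Attach the most likely name tag to each word, or None below the threshold."""
    tagged = []
    for word in query.lower().split():
        candidates = PROB_DICT.get(word, {})
        # Pick the tag with the highest likelihood; unknown words get (None, 0.0).
        tag, prob = max(candidates.items(), key=lambda kv: kv[1], default=(None, 0.0))
        tagged.append((word, tag if prob >= threshold else None))
    return tagged

tagged = tag_query("Starbucks near Seattle Washington")
```

Words that are missing from the dictionary, or whose best tag falls below the threshold, stay untagged and would pass through as ordinary query terms.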

6. Invention application

US20090132515A1 Method and Apparatus for Performing Multi-Phase Ranking of Web Search Results by Re-Ranking Results Using Feature and Label Calibration
Legal status: Under examination (published)
Publication No.: US20090132515A1
Publication date: 2009-05-21
Application No.: US11942410
Filing date: 2007-11-19
Applicant: Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed
Inventor: Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed
IPC classification: G06F17/30
CPC classification: G06F16/951, G06F16/335
Abstract: A method and apparatus for performing multi-phase ranking of web search results by re-ranking results using feature and label calibration are provided. According to one embodiment of the invention, a ranking function is trained by using machine learning techniques on a set of training samples to produce ranking scores. The ranking function is used to rank the set of training samples according to its ranking score, in order of its relevance to a particular query. Next, a re-ranking function is trained by the same training samples to re-rank the documents from the first ranking. The features and labels of the training samples are calibrated and normalized before they are reused to train the re-ranking function. By this method, training data and training features used in past trainings are leveraged to perform additional training of new functions, without requiring the use of additional training data or features.

7. Granted invention

US08112436B2 Semantic and text matching techniques for network search
Legal status: In force
Publication No.: US08112436B2
Publication date: 2012-02-07
Application No.: US12563357
Filing date: 2009-09-21
Applicant: Yumao Lu, Lei Duan, Fan Li, Benoit Dumoulin, Xing Wei
Inventor: Yumao Lu, Lei Duan, Fan Li, Benoit Dumoulin, Xing Wei
IPC classification: G06F17/30
CPC classification: G06F17/30864
Abstract: In one embodiment, access a search query comprising one or more query words, at least one of the query words representing one or more query concepts; access a network document identified for a search query by a search engine, the network document comprising one or more document words, at least one of the document words representing one or more document concepts; semantic-text match the search query and the network document to determine one or more negative semantic-text matches; and construct one or more negative features based on the negative semantic-text matches.

8. Invention application

US20110072023A1 Detect, Index, and Retrieve Term-Group Attributes for Network Search
Legal status: Under examination (published)
Publication No.: US20110072023A1
Publication date: 2011-03-24
Application No.: US12563347
Filing date: 2009-09-21
Applicant: Yumao Lu
Inventor: Yumao Lu
IPC classification: G06F17/30
CPC classification: G06F16/951, G06F16/313, G06F16/353
Abstract: In one embodiment, concept tag a network document comprising document words based on a set of document concepts, each of the document words being indexed with its position within the network document, such that for each of the document words, if the document word represents one of the document concepts, index a document concept tag corresponding to the one document concept with the position of the document word within the network document. Concept tag a search query based on a set of query concepts by associating appropriate query concept tags with selected query words. For each of the query words associated with the query concept tags, determine zero or more first positions within the network document at which the document words match the query word or its synonym and zero or more second positions within the network document at which the document concept tags correspond to the query concept tag.
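A small positional index makes the "first positions" and "second positions" in this abstract concrete. The sketch below assumes a one-word-to-one-concept lookup table and a plain synonym map; the concept names, the sample document, and the functions `build_index` and `match_positions` are hypothetical.

```python
from collections import defaultdict

def build_index(doc_words, concepts):
    """Index every word by position, and every concept tag at the same position."""
    word_pos = defaultdict(list)
    tag_pos = defaultdict(list)
    for pos, word in enumerate(doc_words):
        word_pos[word].append(pos)
        if word in concepts:                 # the word represents a document concept
            tag_pos[concepts[word]].append(pos)
    return word_pos, tag_pos

def match_positions(query_word, query_tag, word_pos, tag_pos, synonyms=None):
    """Return (first positions, second positions) for one tagged query word."""
    candidates = {query_word} | set((synonyms or {}).get(query_word, ()))
    # First positions: the document word matches the query word or a synonym.
    first = sorted(p for w in candidates for p in word_pos.get(w, []))
    # Second positions: the document concept tag corresponds to the query concept tag.
    second = list(tag_pos.get(query_tag, []))
    return first, second

# Hypothetical concept table and document.
concepts = {"seattle": "CITY", "tacoma": "CITY", "pizza": "FOOD"}
doc = "best pizza in seattle and tacoma pizza".split()
word_pos, tag_pos = build_index(doc, concepts)
city_match = match_positions("seattle", "CITY", word_pos, tag_pos)
food_match = match_positions("pie", "FOOD", word_pos, tag_pos, synonyms={"pie": ["pizza"]})
```

Here the query word "seattle" matches the text at position 3 only, while its CITY tag also matches "tacoma" at position 5. That difference between text match and concept match is what the two position lists capture.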

9. Invention application

US20090234836A1 MULTI-TERM SEARCH RESULT WITH UNSUPERVISED QUERY SEGMENTATION METHOD AND APPARATUS
Legal status: Under examination (published)
Publication No.: US20090234836A1
Publication date: 2009-09-17
Application No.: US12048715
Filing date: 2008-03-14
Applicant: Fuchun Peng, Yumao Lu, Nawaaz Ahmed, Bin Tan
Inventor: Fuchun Peng, Yumao Lu, Nawaaz Ahmed, Bin Tan
IPC classification: G06F7/06
CPC classification: G06F16/313
Abstract: Generally, a method and apparatus provides for search results in response to a web search request having at least two search terms in the search request. The method and apparatus includes generating a plurality of term groupings of the search terms and determining a relevance factor for each of the term groupings. The method and apparatus further determines a set of the term groupings based on the relevance factors and therein conducts a web resource search using the set of term groupings, to thereby generate search results. The method and apparatus provides the search results to the requesting entity.
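The grouping and selection steps in this abstract can be sketched as an exhaustive search over contiguous segmentations. The abstract does not say how the relevance factor is computed; the lookup table of per-group scores below is a hypothetical stand-in, as are the additive scoring rule and the function names.

```python
from itertools import combinations

def segmentations(terms):
    """Yield every way to split the terms into contiguous groups."""
    n = len(terms)
    for k in range(n):                            # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [tuple(terms[a:b]) for a, b in zip(bounds, bounds[1:])]

def best_segmentation(query, relevance):
    """Pick the segmentation whose groups have the highest total relevance."""
    terms = query.split()
    def score(seg):
        # Assumption: relevance factors add up; unknown groups score 0.
        return sum(relevance.get(group, 0.0) for group in seg)
    return max(segmentations(terms), key=score)

# Hypothetical relevance factors for term groupings.
relevance = {
    ("new", "york"): 3.0,
    ("times", "square"): 3.0,
    ("new",): 0.5, ("york",): 0.5, ("times",): 0.5, ("square",): 0.5,
}
best = best_segmentation("new york times square", relevance)
```

A query of n terms has 2^(n-1) contiguous segmentations, so brute-force enumeration is only practical for short queries; a production system would presumably use dynamic programming or pruning instead.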

10. Invention application

US20080189262A1 Word pluralization handling in query for web search
Legal status: In force
Publication No.: US20080189262A1
Publication date: 2008-08-07
Application No.: US11701736
Filing date: 2007-02-01
Applicant: Fuchun Peng, Nawaaz Ahmed, Xin Li, Yumao Lu
Inventor: Fuchun Peng, Nawaaz Ahmed, Xin Li, Yumao Lu
IPC classification: G06F17/30
CPC classification: G06F17/30864
Abstract: Techniques for determining when and how to transform words in a query to its plural or non-plural form in order to provide the most relevant search results while minimizing computational overhead are provided. A dictionary is generated based upon the words used in a specified number of previous most frequent search queries and comprises lists of transformations from plural to singular and singular to plural. Unnecessary transformations are removed from the dictionary based upon language modeling. The word to transform is determined by finding the last non-stop re-writable word of the query. The context of the transformed word is confirmed in the search documents and a version of the query is executed using both the original form of the word and the transformation of the word.
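The word-selection rule in this abstract (rewrite the last non-stop, re-writable word and run both query forms) can be sketched as follows. The stop-word list and the transformation dictionary are hypothetical toy data; building the dictionary from query logs, pruning it with a language model, and confirming context in the search documents are not shown.

```python
# Hypothetical stop words and plural/singular transformation dictionary.
STOP_WORDS = {"a", "an", "the", "in", "of", "for", "to", "on"}
TRANSFORMS = {"hotels": "hotel", "hotel": "hotels", "cars": "car", "car": "cars"}

def query_variants(query):
    """Return the original query plus, if possible, one rewritten variant."""
    words = query.lower().split()
    # Scan from the end for the last word that is not a stop word
    # and has an entry in the transformation dictionary.
    for i in range(len(words) - 1, -1, -1):
        word = words[i]
        if word not in STOP_WORDS and word in TRANSFORMS:
            rewritten = words[:i] + [TRANSFORMS[word]] + words[i + 1:]
            return [" ".join(words), " ".join(rewritten)]
    return [" ".join(words)]   # nothing re-writable: run the query as is

variants = query_variants("cheap hotels in paris")
```

Both returned strings would then be issued to the search backend, so results match either form of the word.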
