    • 1. Patent application
    • Query-URL N-Gram Features in Web Ranking
    • US20110040769A1
    • 2011-02-17
    • US12541063
    • 2009-08-13
    • Huihsin Tseng, Longbin Chen, Yumao Lu, Fachun Peng
    • G06F17/30
    • G06F16/951
    • In one embodiment, access one or more pairs of search query and clicked Uniform Resource Locator (URL). For each of the pairs of search query and clicked URL, segment the search query into one or more query segments and the clicked URL into one or more URL segments; construct one or more query-URL n-grams, each of which comprises a query part comprising at least one of the query segments and a URL part comprising at least one of the URL segments; and calculate one or more association scores, each of which for one of the query-URL n-grams and represents a similarity between the query part and the URL part of the query-URL n-gram and is based on a first frequency of the query part and the URL part, a second frequency of the query part, and a third frequency of the URL part.
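The scoring step in the abstract above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract only says the score depends on the joint frequency of the query part and URL part and on their individual frequencies, so pointwise mutual information (PMI) is assumed here as one plausible formula. The function names, the whitespace and punctuation based segmentation, and the restriction to one query segment paired with one URL segment are all assumptions made for the example.

```python
import math
import re
from collections import Counter
from itertools import product


def segment_query(query):
    # Assumption: split the query on whitespace.
    return query.lower().split()


def segment_url(url):
    # Assumption: split the URL on any run of non-alphanumeric characters.
    return [s for s in re.split(r"[^a-z0-9]+", url.lower()) if s]


def association_scores(pairs):
    """Score each (query segment, URL segment) n-gram with PMI.

    Frequencies are counted once per (query, clicked URL) pair.
    """
    joint, q_freq, u_freq = Counter(), Counter(), Counter()
    for query, url in pairs:
        q_segs = set(segment_query(query))
        u_segs = set(segment_url(url))
        q_freq.update(q_segs)                   # second frequency: query part
        u_freq.update(u_segs)                   # third frequency: URL part
        joint.update(product(q_segs, u_segs))   # first frequency: both together
    n = len(pairs)
    # PMI = log( P(q, u) / (P(q) * P(u)) ) = log( c * n / (f_q * f_u) )
    return {
        (q, u): math.log(c * n / (q_freq[q] * u_freq[u]))
        for (q, u), c in joint.items()
    }
```

With this choice, a query segment that co-occurs with a URL segment more often than chance receives a positive score, while segments such as "http" or "com" that appear in every clicked URL score at or near zero.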
    • 2. Patent application
    • QUERY IDENTIFICATION AND NORMALIZATION FOR WEB SEARCH
    • US20100257150A1
    • 2010-10-07
    • US12818036
    • 2010-06-17
    • Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
    • G06F17/30
    • G06F17/30867, G06F17/3064
    • A computer-implemented method for processing user entered query data to improve results of a search of pages using a local search database, when searching the internet, is disclosed. The method includes receiving the user entered query data and parsing each word of the query data and segmenting words using a probabilistic dictionary to determine a likelihood that the word is for a particular name. And, associating the particular names with a name tag to create one or more tagged name terms. Then, normalizing each of the tagged name terms and the normalizing including boosting information if found in the local search database and determining proximity between selected ones of the tagged name terms. The method then generates an optimized search query that incorporates normalized terms and operators. The optimized search query being applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
    • 3. Granted patent
    • Techniques for navigational query identification
    • US07693865B2
    • 2010-04-06
    • US11514076
    • 2006-08-30
    • Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed
    • G06F17/00
    • G06K9/623, G06F17/30707, G06F17/30864, G06K9/6278
    • To accurately classify a query as navigational, thousands of available features are explored, extracted from major commercial search engine results, user Web search click data, query log, and the whole Web's relational content. To obtain the most useful features for navigational query identification, a three level system is used which integrates feature generation, feature integration, and feature selection in a pipeline. Because feature selection plays a key role in classification methodologies, the best feature selection method is coupled with the best classification approach to achieve the best performance for identifying navigational queries. According to one embodiment, linear Support Vector Machine (SVM) is used to rank features and the top ranked features are fed into a Stochastic Gradient Boosting Tree (SGBT) classification method for identifying whether or not a particular query is a navigational query.
    • 4. Patent application
    • QUERY PROCESSING FOR WEB SEARCH
    • US20110264647A1
    • 2011-10-27
    • US13175797
    • 2011-07-01
    • Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
    • G06F17/30
    • G06F17/30867, G06F17/3064
    • A computer-implemented method for processing user entered query data to improve results of a search of pages using a database, when searching the internet, is disclosed. The method includes receiving the user entered query data and parsing each word of the query data and segmenting words using probability to determine a likelihood that the word is for a particular name. And, associating the particular names with a name tag to create one or more tagged name terms. Then, normalizing each of the tagged name terms and the normalizing including boosting information if found in the database and determining proximity between selected ones of the tagged name terms. The method then generates an optimized search query that incorporates normalized terms and operators. The optimized search query being applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
    • 5. Granted patent
    • Local query identification and normalization for web search
    • US07769746B2
    • 2010-08-03
    • US12015448
    • 2008-01-16
    • Yumao Lu, Nawaaz Ahmed, Fuchun Peng, Marco Zagha
    • G06F17/30
    • G06F17/30867, G06F17/3064
    • Computer-implemented methods and systems for processing user entered query data to improve results of a search of pages using a local search database are provided, when searching the internet. The method includes receiving the user entered query data and parsing each word of the query data and examining each word to determine if the word is associated with one of a business name, a city name or a state name. The examining uses probabilistic dictionaries to determine a likelihood that the word is for a particular term or intent. The method further includes normalizing each of the tagged business terms. The normalizing includes boosting information if found in the local search database and determining proximity between selected ones of the tagged terms. Then, generating an optimized internal search query that incorporates constraints and ranking based on at least the boosting information and the determined proximity between the selected tagged terms. The optimized internal search query is applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
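The tagging and normalization steps described in the abstract above can be illustrated with a small sketch. Everything concrete here is hypothetical: the dictionary contents, the probabilities, the threshold, the `tag:term` query syntax, and the function names are invented for illustration, and the boosting and proximity steps mentioned in the abstract are left out.

```python
# Hypothetical probabilistic dictionary: token -> {tag: probability}.
TAG_PROBS = {
    "starbucks": {"business": 0.95},
    "sunnyvale": {"city": 0.90},
    "ca": {"state": 0.70},
    "washington": {"state": 0.55, "city": 0.40},
}

# Hypothetical normalization table: (tag, surface form) -> canonical form.
CANONICAL = {("state", "ca"): "california"}


def tag_query(query, threshold=0.5):
    """Attach the most likely tag to each word, or None below the threshold."""
    tagged = []
    for word in query.lower().split():
        probs = TAG_PROBS.get(word, {})
        tag, p = max(probs.items(), key=lambda kv: kv[1], default=(None, 0.0))
        if p < threshold:
            tag = None
        tagged.append((word, tag))
    return tagged


def normalize(tagged):
    """Replace tagged terms with their canonical form when one is known."""
    return [(CANONICAL.get((tag, word), word), tag) for word, tag in tagged]


def build_query(tagged):
    """Join terms into an internal query string (the syntax is an assumption)."""
    return " AND ".join(f"{tag}:{word}" if tag else word for word, tag in tagged)
```

Under these assumptions, `build_query(normalize(tag_query("Starbucks Sunnyvale CA")))` produces `business:starbucks AND city:sunnyvale AND state:california`, while words missing from the dictionary pass through untagged.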
    • 6. Patent application
    • Method and Apparatus for Performing Multi-Phase Ranking of Web Search Results by Re-Ranking Results Using Feature and Label Calibration
    • US20090132515A1
    • 2009-05-21
    • US11942410
    • 2007-11-19
    • Yumao Lu, Fuchun Peng, Xin Li, Nawaaz Ahmed
    • G06F17/30
    • G06F16/951, G06F16/335
    • A method and apparatus for performing multi-phase ranking of web search results by re-ranking results using feature and label calibration are provided. According to one embodiment of the invention, a ranking function is trained by using machine learning techniques on a set of training samples to produce ranking scores. The ranking function is used to rank the set of training samples according to its ranking score, in order of its relevance to a particular query. Next, a re-ranking function is trained by the same training samples to re-rank the documents from the first ranking. The features and labels of the training samples are calibrated and normalized before they are reused to train the re-ranking function. By this method, training data and training features used in past trainings are leveraged to perform additional training of new functions, without requiring the use of additional training data or features.
    • 8. Patent application
    • Detect, Index, and Retrieve Term-Group Attributes for Network Search
    • US20110072023A1
    • 2011-03-24
    • US12563347
    • 2009-09-21
    • Yumao Lu
    • G06F17/30
    • G06F16/951, G06F16/313, G06F16/353
    • In one embodiment, concept tag a network document comprising document words based on a set of document concepts, each of the document words being indexed with its position within the network document, such that for each of the document words, if the document word represents one of the document concepts, index a document concept tag corresponding to the one document concept with the position of the document word within the network document. Concept tag a search query based on a set of query concepts by associating appropriate query concept tags with selected query words. For each of the query words associated with the query concept tags, determine zero or more first positions within the network document at which the document words match the query word or its synonym and zero or more second positions within the network document at which the document concept tags correspond to the query concept tag.
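The positional indexing in the abstract above can be sketched as two inverted indexes that share word positions: one keyed by document word, one keyed by concept tag. The concept dictionary, the tag names, and the function names below are hypothetical, and the abstract does not say how the two position lists are combined afterwards, so the sketch stops at returning them.

```python
from collections import defaultdict

# Hypothetical document concepts: word -> concept tag.
DOC_CONCEPTS = {"paris": "CITY", "london": "CITY", "hilton": "HOTEL"}


def build_index(doc_words):
    """Index each word, and each concept tag it represents, by word position."""
    word_pos = defaultdict(list)
    concept_pos = defaultdict(list)
    for pos, word in enumerate(doc_words):
        w = word.lower()
        word_pos[w].append(pos)
        concept = DOC_CONCEPTS.get(w)
        if concept is not None:
            # The concept tag is indexed at the same position as its word.
            concept_pos[concept].append(pos)
    return word_pos, concept_pos


def match_positions(query_word, query_concept, word_pos, concept_pos, synonyms=()):
    """Return (first positions, second positions) for one tagged query word.

    First positions: where a document word equals the query word or a synonym.
    Second positions: where a document concept tag equals the query concept tag.
    """
    candidates = (query_word.lower(), *synonyms)
    first = sorted(p for w in candidates for p in word_pos.get(w, []))
    second = list(concept_pos.get(query_concept, []))
    return first, second
```

For the document "Hilton hotels in Paris and London" and the query word "Paris" tagged CITY, the first list holds the position of "Paris" and the second holds the positions of both city names, so a ranker could reward documents that match the concept even where the literal word differs.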
    • 10. Patent application
    • Word pluralization handling in query for web search
    • US20080189262A1
    • 2008-08-07
    • US11701736
    • 2007-02-01
    • Fuchun Peng, Nawaaz Ahmed, Xin Li, Yumao Lu
    • G06F17/30
    • G06F17/30864
    • Techniques for determining when and how to transform words in a query to its plural or non-plural form in order to provide the most relevant search results while minimizing computational overhead are provided. A dictionary is generated based upon the words used in a specified number of previous most frequent search queries and comprises lists of transformations from plural to singular and singular to plural. Unnecessary transformations are removed from the dictionary based upon language modeling. The word to transform is determined by finding the last non-stop re-writable word of the query. The context of the transformed word is confirmed in the search documents and a version of the query is executed using both the original form of the word and the transformation of the word.
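The word-selection rule in the abstract above (transform the last non-stop, re-writable word and run the query with both forms) can be sketched as below. The stop-word list, the transformation dictionary, and the `(a OR b)` query syntax are hypothetical stand-ins. The sketch reads "last non-stop re-writable word" as the last word that is neither a stop word nor missing from the dictionary, and it omits the language-model pruning and the context check against search documents.

```python
# Hypothetical stop words and singular/plural transformation dictionary.
STOP_WORDS = {"a", "for", "in", "of", "the"}
TRANSFORMS = {
    "hotel": "hotels",
    "hotels": "hotel",
    "recipe": "recipes",
    "recipes": "recipe",
}


def expand_query(query):
    """Rewrite the last non-stop, re-writable word as (original OR transformed)."""
    words = query.lower().split()
    # Scan from the end so the last eligible word is the one transformed.
    for i in range(len(words) - 1, -1, -1):
        w = words[i]
        if w not in STOP_WORDS and w in TRANSFORMS:
            words[i] = f"({w} OR {TRANSFORMS[w]})"
            break
    return " ".join(words)
```

Here `expand_query("cheap hotel in paris")` yields `cheap (hotel OR hotels) in paris`. Only one word is expanded per query, which keeps the extra retrieval cost small.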