    • 1. Invention application
    • QUERY PROCESSING FOR WEB SEARCH
    • US20110264647A1
    • 2011-10-27
    • US13175797
    • 2011-07-01
    • Yumao Lu; Nawaaz Ahmed; Fuchun Peng; Marco Zagha
    • G06F17/30
    • G06F17/30867; G06F17/3064
    • A computer-implemented method for processing user-entered query data to improve the results of a search of pages using a database, when searching the internet, is disclosed. The method includes receiving the user-entered query data, parsing each word of the query data, and segmenting words using probability to determine a likelihood that a word is for a particular name. The particular names are associated with a name tag to create one or more tagged name terms. Each of the tagged name terms is then normalized; the normalizing includes boosting information if found in the database and determining proximity between selected ones of the tagged name terms. The method then generates an optimized search query that incorporates normalized terms and operators. The optimized search query is applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
    • 2. Granted invention
    • Local query identification and normalization for web search
    • US07769746B2
    • 2010-08-03
    • US12015448
    • 2008-01-16
    • Yumao Lu; Nawaaz Ahmed; Fuchun Peng; Marco Zagha
    • G06F17/30
    • G06F17/30867; G06F17/3064
    • Computer-implemented methods and systems are provided for processing user-entered query data to improve the results of a search of pages using a local search database when searching the internet. The method includes receiving the user-entered query data, parsing each word of the query data, and examining each word to determine whether it is associated with a business name, a city name, or a state name. The examining uses probabilistic dictionaries to determine a likelihood that the word is for a particular term or intent. The method further includes normalizing each of the tagged business terms. The normalizing includes boosting information if found in the local search database and determining proximity between selected ones of the tagged terms. An optimized internal search query is then generated that incorporates constraints and ranking based on at least the boosting information and the determined proximity between the selected tagged terms. The optimized internal search query is applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
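The word-tagging step in the abstract above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the dictionary contents, the probability values, the 0.5 threshold, and the function name `tag_query` do not come from the patent, and normalization, boosting, and proximity are not modeled.

```python
# Hypothetical probabilistic dictionaries: P(tag | word). All values are invented.
PROB_DICT = {
    "starbucks": {"business": 0.95},
    "sunnyvale": {"city": 0.90},
    "ca": {"state": 0.70},
    "washington": {"state": 0.55, "city": 0.40},
}


def tag_query(query, threshold=0.5):
    """Tag each word with its most likely label, or None if below the threshold."""
    tagged = []
    for word in query.lower().split():
        candidates = PROB_DICT.get(word, {})
        tag = None
        if candidates:
            best_tag, prob = max(candidates.items(), key=lambda kv: kv[1])
            if prob >= threshold:
                tag = best_tag
        tagged.append((word, tag))
    return tagged


print(tag_query("Starbucks Sunnyvale CA open late"))
```

Words that are absent from the dictionary, or whose best label falls below the threshold, keep a `None` tag and would pass through to the search query unchanged.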
    • 3. Invention application
    • Method and Apparatus for Performing Multi-Phase Ranking of Web Search Results by Re-Ranking Results Using Feature and Label Calibration
    • US20090132515A1
    • 2009-05-21
    • US11942410
    • 2007-11-19
    • Yumao Lu; Fuchun Peng; Xin Li; Nawaaz Ahmed
    • G06F17/30
    • G06F16/951; G06F16/335
    • A method and apparatus for performing multi-phase ranking of web search results by re-ranking results using feature and label calibration are provided. According to one embodiment of the invention, a ranking function is trained by using machine learning techniques on a set of training samples to produce ranking scores. The ranking function is used to rank the training samples according to their ranking scores, in order of their relevance to a particular query. Next, a re-ranking function is trained on the same training samples to re-rank the documents from the first ranking. The features and labels of the training samples are calibrated and normalized before they are reused to train the re-ranking function. By this method, training data and training features used in past trainings are leveraged to perform additional training of new functions, without requiring the use of additional training data or features.
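The abstract above does not say how features and labels are calibrated. As one plausible reading only, the sketch below rescales first-phase scores to the range 0 to 1 within each query, so that values from different queries become comparable before they are reused to train a second-phase function. The min-max scheme, the data layout, and the name `normalize_per_query` are all assumptions, not the patent's method.

```python
from collections import defaultdict


def normalize_per_query(samples):
    """Min-max scale scores within each query.

    samples: list of (query_id, score) pairs.
    Returns a list of (query_id, scaled_score) in the same order.
    """
    by_query = defaultdict(list)
    for qid, score in samples:
        by_query[qid].append(score)
    bounds = {qid: (min(v), max(v)) for qid, v in by_query.items()}

    scaled = []
    for qid, score in samples:
        lo, hi = bounds[qid]
        # A query with a single distinct score carries no ranking signal: map to 0.0.
        scaled.append((qid, 0.0 if hi == lo else (score - lo) / (hi - lo)))
    return scaled


first_phase = [("q1", 2.0), ("q1", 4.0), ("q1", 3.0), ("q2", 5.0)]
print(normalize_per_query(first_phase))
```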
    • 4. Invention application
    • QUERY IDENTIFICATION AND NORMALIZATION FOR WEB SEARCH
    • US20100257150A1
    • 2010-10-07
    • US12818036
    • 2010-06-17
    • Yumao Lu; Nawaaz Ahmed; Fuchun Peng; Marco Zagha
    • G06F17/30
    • G06F17/30867; G06F17/3064
    • A computer-implemented method for processing user-entered query data to improve the results of a search of pages using a local search database, when searching the internet, is disclosed. The method includes receiving the user-entered query data, parsing each word of the query data, and segmenting words using a probabilistic dictionary to determine a likelihood that a word is for a particular name. The particular names are associated with a name tag to create one or more tagged name terms. Each of the tagged name terms is then normalized; the normalizing includes boosting information if found in the local search database and determining proximity between selected ones of the tagged name terms. The method then generates an optimized search query that incorporates normalized terms and operators. The optimized search query is applied to the internet to enable search results to be produced and displayed to the user in response to the entered query data.
    • 5. Invention application
    • SEARCH QUERY DISAMBIGUATION
    • US20100205198A1
    • 2010-08-12
    • US12367114
    • 2009-02-06
    • Gilad Mishne; Raymond Stata; Fuchun Peng
    • G06F17/30
    • G06F16/3346; G06F16/951
    • Disclosed herein is a system and method of query disambiguation. At least one model is generated using training data. The model can be used to score, or rank, the possible interpretations identified for a query, and the resulting scores or ranking can be used to select one interpretation from among them. A selected interpretation can be used to process a web search request, e.g., to generate search results that relate to the selected query interpretation, to rank or order the items in the search result based on relevance to the selected query interpretation, and/or to identify a presentation to be used to display the search results based on the selected query interpretation.
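Scoring candidate interpretations and selecting the best one, as the abstract above describes, can be sketched as follows. The linear scoring form, the feature names, the weights, and the example interpretations are hypothetical stand-ins for a model learned from training data; the patent does not specify them.

```python
# Hypothetical weights standing in for a model learned from training data.
WEIGHTS = {"click_rate": 2.0, "entity_match": 1.0}


def score(features):
    """Linear score of one interpretation's feature dictionary."""
    return sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())


def select_interpretation(interpretations):
    """Return the (label, features) pair with the highest score."""
    return max(interpretations, key=lambda item: score(item[1]))


candidates = [
    ("jaguar (animal)", {"click_rate": 0.2, "entity_match": 1.0}),
    ("jaguar (car)", {"click_rate": 0.6, "entity_match": 1.0}),
]
best_label, _ = select_interpretation(candidates)
print(best_label)
```

The selected label could then drive result generation, ranking, or presentation, which are outside the scope of this sketch.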
    • 6. Granted invention
    • Techniques for navigational query identification
    • US07693865B2
    • 2010-04-06
    • US11514076
    • 2006-08-30
    • Yumao Lu; Fuchun Peng; Xin Li; Nawaaz Ahmed
    • G06F17/00
    • G06K9/623; G06F17/30707; G06F17/30864; G06K9/6278
    • To accurately classify a query as navigational, thousands of available features are explored. The features are extracted from major commercial search engine results, user Web search click data, the query log, and the whole Web's relational content. To obtain the most useful features for navigational query identification, a three-level system is used which integrates feature generation, feature integration, and feature selection in a pipeline. Because feature selection plays a key role in classification methodologies, the best feature selection method is coupled with the best classification approach to achieve the best performance for identifying navigational queries. According to one embodiment, a linear Support Vector Machine (SVM) is used to rank features, and the top-ranked features are fed into a Stochastic Gradient Boosting Tree (SGBT) classification method for identifying whether or not a particular query is a navigational query.
    • 7. Invention application
    • ABBREVIATION HANDLING IN WEB SEARCH
    • US20090259629A1
    • 2009-10-15
    • US12103126
    • 2008-04-15
    • Xing Wei; Fuchun Peng; Benoit Dumoulin
    • G06F17/30
    • G06F17/30672
    • A method for handling abbreviations in web queries includes building a dictionary of a plurality of possible word expansions for a plurality of potential abbreviations related to query terms received or anticipated to be received by a search engine; accepting a query including an abbreviation; expanding the abbreviation into one of the plurality of word expansions if a probability that the expansion is correct is above a threshold value, wherein the probability is determined by taking into consideration a context of the abbreviation within the query, and wherein the context includes at least anchor text; and sending the query with the expanded abbreviation to the search engine to generate a search results page related to the query.
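The threshold test in the abstract above (expand only when the expansion's probability, given context, exceeds a cut-off) can be sketched as below. This is an illustration under stated assumptions: the dictionary layout, the probabilities, the 0.7 threshold, and the name `expand_query` are invented, and context is reduced to the other words in the query. The anchor-text context named in the abstract is not modeled.

```python
# Hypothetical dictionary: abbreviation -> {expansion: {context word: probability}}.
# "*" is the fallback probability used when no context word matches.
EXPANSIONS = {
    "dr": {
        "drive": {"street": 0.9, "address": 0.8, "*": 0.3},
        "doctor": {"clinic": 0.9, "appointment": 0.85, "*": 0.4},
    }
}


def expand_query(query, threshold=0.7):
    """Expand an abbreviation only if its best expansion exceeds the threshold."""
    words = query.lower().split()
    out = []
    for i, word in enumerate(words):
        context = words[:i] + words[i + 1:]
        best, best_p = word, 0.0
        for expansion, probs in EXPANSIONS.get(word, {}).items():
            p = max([probs.get(c, 0.0) for c in context] + [probs["*"]])
            if p > best_p:
                best, best_p = expansion, p
        out.append(best if best_p > threshold else word)
    return " ".join(out)


print(expand_query("dr smith clinic"))
print(expand_query("dr smith"))
```

In the second call no context word supports either expansion strongly enough, so the abbreviation is left as typed.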
    • 8. Invention application
    • Predicting results for input data based on a model generated from clusters
    • US20070282591A1
    • 2007-12-06
    • US11445587
    • 2006-06-01
    • Fuchun Peng
    • G06F17/28
    • G06F17/2775; G06F17/278; G06F17/2863
    • A method is provided for predicting results for input data based on a model that is generated from clusters of related characters, clusters of related segments, and training data. The method comprises receiving a data set that includes a plurality of words in a particular language in which words are formed by characters. Clusters of related characters are formed from the data set. A model is generated based at least on the clusters of related characters and training data. The model may also be based on the clusters of related segments. The training data includes a plurality of entries, wherein each entry includes a character and a designated result for said character. A set of input data that includes characters that have not been associated with designated results is received. The model is applied to the input data to determine predicted results for characters within the input data.
    • 9. Granted invention
    • Normalizing query words in web search
    • US08010547B2
    • 2011-08-30
    • US12103382
    • 2008-04-15
    • Fuchun Peng; George H. Mills; Benoit Dumoulin
    • G06F17/30
    • G06F17/277; Y10S707/99931; Y10S707/99932; Y10S707/99933
    • A method for normalizing query words in web search includes populating a dictionary with join and split candidates and corresponding joined and split words from an aggregate of query logs; determining a confidence score for join and split candidates, a highest confidence score for each being characterized in the dictionary as must-join and must-split, respectively; accepting queries with words amenable to being split or joined, or amenable to an addition or deletion of a hyphen or an apostrophe; generating, based on the accepted queries, split candidates obtained from the dictionary, and candidates of join, hyphen, or apostrophe algorithmically; submitting to a search engine the generated candidates characterized as must-join or must-split in the dictionary, to improve search results returned in response to the queries; and applying a language dictionary to generated candidates not characterized as must-split or must-join, to rank them, and submitting the highest-ranked of those to the search engine.
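The must-join and must-split rewriting in the abstract above can be sketched as a dictionary lookup with a confidence cut-off. The dictionary entries, the confidence values, the 0.9 cut-off, and the name `normalize` are assumptions made for illustration. Hyphen and apostrophe candidates, and the language-dictionary ranking of lower-confidence candidates, are left out.

```python
# Hypothetical dictionary mined from query logs: candidate -> (rewrite, confidence).
JOIN_SPLIT = {
    "face book": ("facebook", 0.98),
    "newyork": ("new york", 0.95),
    "youtube": ("you tube", 0.10),
}
MUST_THRESHOLD = 0.9  # assumed cut-off for "must-join" / "must-split"


def normalize(query):
    """Apply only high-confidence join and split rewrites to a query."""
    words = query.lower().split()
    out, i = [], 0
    while i < len(words):
        # Must-join: two adjacent words that should be written as one.
        if i + 1 < len(words):
            pair = words[i] + " " + words[i + 1]
            rewrite, conf = JOIN_SPLIT.get(pair, (pair, 0.0))
            if conf >= MUST_THRESHOLD:
                out.append(rewrite)
                i += 2
                continue
        # Must-split: one word that should be written as two.
        rewrite, conf = JOIN_SPLIT.get(words[i], (words[i], 0.0))
        out.append(rewrite if conf >= MUST_THRESHOLD else words[i])
        i += 1
    return " ".join(out)


print(normalize("face book login"))
print(normalize("newyork pizza"))
```

The low-confidence entry for "youtube" is deliberately ignored by the cut-off, so that query would be submitted unchanged.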
    • 10. Invention application
    • PREDICTIVE PERSON NAME VARIANTS FOR WEB SEARCH
    • US20100312778A1
    • 2010-12-09
    • US12480628
    • 2009-06-08
    • Yumao LU; Fuchun PENG; Benoit DUMOULIN
    • G06F17/30
    • G06F16/3322
    • Techniques are provided for determining when, and which, name variant candidates to use to rewrite a search query that includes a person's name in order to provide the most relevant search results. A determination is made whether a person name is present in a search query request entered by a user. Name variant candidates are generated for each person name. The name variant candidates are then ranked for each person name based upon one or more models that calculate a probability value for each name variant candidate. Based upon these rankings, the query may be rewritten to include the original person name and a specified number of top-ranked name variant candidates to present the user with the most relevant search results.
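The rank-and-rewrite step in the abstract above can be sketched as follows. The variant table, its probabilities, the `top_k` and `min_prob` parameters, the `OR` query syntax, and the name `rewrite_query` are all hypothetical. Person-name detection is replaced here by a simple dictionary lookup, which is a simplification rather than the patent's technique.

```python
# Hypothetical variant model: name -> {variant: probability}.
VARIANTS = {
    "bill": {"william": 0.6, "billy": 0.3, "will": 0.05},
    "bob": {"robert": 0.7, "bobby": 0.2},
}


def rewrite_query(query, top_k=2, min_prob=0.1):
    """Keep the original name and add its top-ranked variants as alternatives."""
    parts = []
    for word in query.lower().split():
        ranked = sorted(VARIANTS.get(word, {}).items(), key=lambda kv: kv[1], reverse=True)
        keep = [variant for variant, prob in ranked[:top_k] if prob >= min_prob]
        if keep:
            parts.append("(" + " OR ".join([word] + keep) + ")")
        else:
            parts.append(word)
    return " ".join(parts)


print(rewrite_query("bill gates"))
```

Raising `min_prob` or lowering `top_k` makes the rewrite more conservative, which corresponds to the "when" half of the decision the abstract describes.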