Quick Patent Search: fast search of global patents, free commercial patent database - IPRDB

1. Granted patent

US08812493B2 Search results ranking using editing distance and document information (status: in force)
Publication No.: US08812493B2
Publication date: 2014-08-19
Application No.: US12101951
Filing date: 2008-04-11
Applicant: Vladimir Tankovich, Hang Li, Dmitriy Meyerzon, Jun Xu
Inventor: Vladimir Tankovich, Hang Li, Dmitriy Meyerzon, Jun Xu
IPC classification: G06F7/00
CPC classification: G06F17/2211, G06F17/30864
Abstract: Architecture for extracting document information from documents received as search results based on a query string, and computing an edit distance between the data string and the query string. The edit distance is employed in determining relevance of the document as part of result ranking by detecting near-matches of a whole query or part of the query. The edit distance evaluates how close the query string is to a given data stream that includes document information such as TAUC (title, anchor text, URL, clicks) information, etc. The architecture includes the index-time splitting of compound terms in the URL to allow the more effective discovery of query terms. Additionally, index-time filtering of anchor text is utilized to find the top N anchors of one or more of the document results. The TAUC information can be input to a neural network (e.g., 2-layer) to improve relevance metrics for ranking the search results.
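The edit-distance idea in this abstract can be illustrated with a minimal Python sketch. This is not the patented method: it assumes a term-level Levenshtein distance with unit costs, and the names `edit_distance` and `term_edit_distance`, as well as the whitespace tokenization, are hypothetical choices made for illustration.

```python
def edit_distance(source, target):
    """Levenshtein distance between two sequences.

    Insertions, deletions and substitutions each cost 1 (an assumption;
    the abstract does not state the cost model).
    """
    prev = list(range(len(target) + 1))
    for i, s in enumerate(source, start=1):
        curr = [i]
        for j, t in enumerate(target, start=1):
            cost = 0 if s == t else 1
            curr.append(min(prev[j] + 1,          # delete s
                            curr[j - 1] + 1,      # insert t
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def term_edit_distance(query, data_string):
    """Compare a query with a data string (e.g. a title) term by term."""
    return edit_distance(query.lower().split(), data_string.lower().split())


if __name__ == "__main__":
    # One term ("results") has to be inserted to turn the query into the title.
    print(term_edit_distance("search ranking", "Search results ranking"))
```

A smaller distance indicates a closer near-match between the query and the data string; how that distance is turned into a ranking feature is left out of this sketch.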

2. Patent application

US20090259651A1 SEARCH RESULTS RANKING USING EDITING DISTANCE AND DOCUMENT INFORMATION (status: in force)
Publication No.: US20090259651A1
Publication date: 2009-10-15
Application No.: US12101951
Filing date: 2008-04-11
Applicant: Vladimir Tankovich, Hang Li, Dmitriy Meyerzon, Jun Xu
Inventor: Vladimir Tankovich, Hang Li, Dmitriy Meyerzon, Jun Xu
IPC classification: G06F17/30
CPC classification: G06F17/2211, G06F17/30864
Abstract: Architecture for extracting document information from documents received as search results based on a query string, and computing an edit distance between the data string and the query string. The edit distance is employed in determining relevance of the document as part of result ranking by detecting near-matches of a whole query or part of the query. The edit distance evaluates how close the query string is to a given data stream that includes document information such as TAUC (title, anchor text, URL, clicks) information, etc. The architecture includes the index-time splitting of compound terms in the URL to allow the more effective discovery of query terms. Additionally, index-time filtering of anchor text is utilized to find the top N anchors of one or more of the document results. The TAUC information can be input to a neural network (e.g., 2-layer) to improve relevance metrics for ranking the search results.

3. Patent application

US20100228711A1 Enterprise Search Method and System (status: in force)
Publication No.: US20100228711A1
Publication date: 2010-09-09
Application No.: US12391484
Filing date: 2009-02-24
Applicant: Hang Li, Yunhua Hu, Xin Zou, Xiaoyuan Cui, Guangping Gao, Dmitriy Meyerzon, Victor Poznanski
Inventor: Hang Li, Yunhua Hu, Xin Zou, Xiaoyuan Cui, Guangping Gao, Dmitriy Meyerzon, Victor Poznanski
IPC classification: G06F7/06, G06F17/30, G06F3/048
CPC classification: G06F17/30867, G06F17/30528, G06F17/3053, G06F17/30554, G06F17/30864
Abstract: A system and method for enterprise search includes one or more computer-readable media storing computer-executable instructions that, when executed on one or more processors that perform acts including extracting one or more of term data, personal data and metadata from one or more predetermined resources; retrieving a set of information derived from the extracted term data, personal data and metadata responsive to a query; and receiving feedback responsive to the set of information, the feedback augmenting at least one of the one or more predetermined resources.

4. Granted patent

US07716198B2 Ranking search results using feature extraction (status: lapsed)
Publication No.: US07716198B2
Publication date: 2010-05-11
Application No.: US11019091
Filing date: 2004-12-21
Applicant: Dmitriy Meyerzon, Hang Li
Inventor: Dmitriy Meyerzon, Hang Li
IPC classification: G06F17/30
CPC classification: G06F17/30684
Abstract: Methods and computer-readable media are provided for ranking search results using feature extraction data. Each of the results of a search engine query is parsed to obtain data, such as text, formatting information, metadata, and the like. The text, the formatting information and the metadata are passed through a feature extraction application to extract data that may be used to improve a ranking of the search results based on relevance of the search results to the search engine query. The feature extraction application extracts features, such as titles, found in any of the text based on formatting information applied to or associated with the text. The extracted titles, the text, the formatting information and the metadata for any given search results item are processed according to a field weighting application for determining a ranking of the given search results item. Ranked search results items may then be displayed according to ranking.
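The field-weighting step described in this abstract could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the patent's scoring function: the field names, the weights in `FIELD_WEIGHTS`, and the simple term-count scoring are all hypothetical.

```python
# Hypothetical per-field weights: a hit in an extracted title counts more
# than a hit in metadata, which counts more than a hit in body text.
FIELD_WEIGHTS = {"extracted_title": 3.0, "metadata": 2.0, "text": 1.0}


def field_weighted_score(query, fields, weights=FIELD_WEIGHTS):
    """Sum query-term occurrences per field, scaled by the field's weight."""
    terms = query.lower().split()
    score = 0.0
    for name, content in fields.items():
        tokens = content.lower().split()
        hits = sum(tokens.count(term) for term in terms)
        score += weights.get(name, 0.0) * hits
    return score


def rank(query, results):
    """Order result items from highest to lowest field-weighted score."""
    return sorted(results,
                  key=lambda item: field_weighted_score(query, item["fields"]),
                  reverse=True)


if __name__ == "__main__":
    results = [
        {"id": "b", "fields": {"extracted_title": "Meeting notes",
                               "text": "the budget is fine"}},
        {"id": "a", "fields": {"extracted_title": "Quarterly budget report",
                               "text": "numbers"}},
    ]
    print([item["id"] for item in rank("budget", results)])
```

In this sketch a match in the extracted title outranks the same match in body text only because of the assumed weights; a real system would tune or learn them.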

5. Patent application

US20090319505A1 TECHNIQUES FOR EXTRACTING AUTHORSHIP DATES OF DOCUMENTS (status: under examination, published)
Publication No.: US20090319505A1
Publication date: 2009-12-24
Application No.: US12141935
Filing date: 2008-06-19
Applicant: Hang Li, Yunhua Hu, Guangping Gao, Yauhen Shnitko, Dmitriy Meyerzon, David Mowatt
Inventor: Hang Li, Yunhua Hu, Guangping Gao, Yauhen Shnitko, Dmitriy Meyerzon, David Mowatt
IPC classification: G06F7/06, G06F17/30
CPC classification: G06F17/2765, G06N3/04
Abstract: Various technologies and techniques are disclosed for calculating authorship dates for a document. A portion of a document to select to look for possible authorship dates is determined. The possible authorship dates are extracted from the portion of the document. A revised authorship date of the document is generated using a neural network. The revised authorship date is returned to an application or process that requested the date.

6. Patent application

US20060136411A1 Ranking search results using feature extraction (status: lapsed)
Publication No.: US20060136411A1
Publication date: 2006-06-22
Application No.: US11019091
Filing date: 2004-12-21
Applicant: Dmitriy Meyerzon, Hang Li
Inventor: Dmitriy Meyerzon, Hang Li
IPC classification: G06F17/30
CPC classification: G06F17/30684
Abstract: Methods and computer-readable media are provided for ranking search results using feature extraction data. Each of the results of a search engine query is parsed to obtain data, such as text, formatting information, metadata, and the like. The text, the formatting information and the metadata are passed through a feature extraction application to extract data that may be used to improve a ranking of the search results based on relevance of the search results to the search engine query. The feature extraction application extracts features, such as titles, found in any of the text based on formatting information applied to or associated with the text. The extracted titles, the text, the formatting information and the metadata for any given search results item are processed according to a field weighting application for determining a ranking of the given search results item. Ranked search results items may then be displayed according to ranking.

7. Granted patent

US09639609B2 Enterprise search method and system (status: in force)
Publication No.: US09639609B2
Publication date: 2017-05-02
Application No.: US12391484
Filing date: 2009-02-24
Applicant: Hang Li, Yunhua Hu, Xin Zou, Xiaoyuan Cui, Guangping Gao, Dmitriy Meyerzon, Victor Poznanski
Inventor: Hang Li, Yunhua Hu, Xin Zou, Xiaoyuan Cui, Guangping Gao, Dmitriy Meyerzon, Victor Poznanski
IPC classification: G06F17/30
CPC classification: G06F17/30867, G06F17/30528, G06F17/3053, G06F17/30554, G06F17/30864
Abstract: A system and method for enterprise search includes one or more computer-readable media storing computer-executable instructions that, when executed on one or more processors that perform acts including extracting one or more of term data, personal data and metadata from one or more predetermined resources; retrieving a set of information derived from the extracted term data, personal data and metadata responsive to a query; and receiving feedback responsive to the set of information, the feedback augmenting at least one of the one or more predetermined resources.

8. Patent application

US20060047637A1 System and method for managing information by answering a predetermined number of predefined questions (status: under examination, published)
Publication No.: US20060047637A1
Publication date: 2006-03-02
Application No.: US10932547
Filing date: 2004-09-02
Applicant: Dmitriy Meyerzon, Hang Li, Joseph Sherman, Yunbo Cao, Zheng Chen
Inventor: Dmitriy Meyerzon, Hang Li, Joseph Sherman, Yunbo Cao, Zheng Chen
IPC classification: G06F17/30
CPC classification: G06F16/3329, G06F16/316, G06F16/951
Abstract: The present invention is a system for answering questions. The present invention uses a data mining module to mine data, such as enterprise data, and to configure the data to answer a predetermined number of questions each having a predefined form. The present invention also provides a user interface component for receiving user queries and responding to those queries.

9. Patent application

US20090182723A1 RANKING SEARCH RESULTS USING AUTHOR EXTRACTION (status: under examination, published)
Publication No.: US20090182723A1
Publication date: 2009-07-16
Application No.: US11972613
Filing date: 2008-01-10
Applicant: Yauhen Shnitko, Dmitriy Meyerzon, Hang Li, Yunhua Hu
Inventor: Yauhen Shnitko, Dmitriy Meyerzon, Hang Li, Yunhua Hu
IPC classification: G06F17/30
CPC classification: G06F16/38
Abstract: Architecture that extracts author information from general documents and uses the author information for search results ranking. The architecture performs automatic author value extraction and makes the extracted value available at index time for subsequent use at query processing and results ranking. Machine learning (e.g., a perceptron algorithm) is employed and a set of input features for the perceptron algorithm utilized for author value extraction. The extracted author value is converted into a feature for input a ranking function for generating a ranking score for each document. The input features can also be weighted according to weighting criteria.
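Since this abstract names a perceptron algorithm, here is a minimal Python sketch of a binary perceptron that decides whether a candidate string is an author value. The three input features and the tiny training set are invented for illustration only; the abstract does not list the actual features or training data.

```python
def predict(weights, bias, features):
    """Return 1 (author) if the weighted sum is positive, else 0."""
    activation = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=100, learning_rate=1.0):
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in samples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # every sample classified correctly
            break
    return weights, bias


# Hypothetical features per candidate string:
# (preceded by "by" or "author:", appears near the top, looks like a capitalized name)
TRAINING = [
    ((1, 1, 1), 1),
    ((1, 0, 1), 1),
    ((0, 1, 1), 0),
    ((0, 0, 1), 0),
    ((0, 1, 0), 0),
    ((1, 0, 0), 0),
]

if __name__ == "__main__":
    weights, bias = train_perceptron(TRAINING)
    print(predict(weights, bias, (1, 1, 1)))
```

The toy data is linearly separable, so training stops once an epoch passes without mistakes. How the extracted author value becomes a ranking feature is outside this sketch.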

10. Granted patent

US07469251B2 Extraction of information from documents (status: in force)
Publication No.: US07469251B2
Publication date: 2008-12-23
Application No.: US11192687
Filing date: 2005-07-29
Applicant: Hang Li, Ruihua Song, Yunbo Cao, Dmitriy Meyerzon
Inventor: Hang Li, Ruihua Song, Yunbo Cao, Dmitriy Meyerzon
IPC classification: G06F17/30
CPC classification: G06F17/211, Y10S707/99935
Abstract: An information extraction model is trained on format features identified within labeled training documents. Information from a document is extracted by assigning labels to units based on format features of the units within the document. A begin label and end label are identified and the information is extracted between the begin label and the end label. The extracted information can be used in various document processing tasks such as ranking.
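The begin/end labeling described in this abstract can be sketched in Python as follows. The trained model is replaced here by a hand-written rule (the first run of units in the largest font is treated as the title), which is purely an assumption for illustration; the unit structure, the `font_size` feature, and the function names are likewise hypothetical.

```python
def find_boundaries(units):
    """Stand-in for a trained model: return (begin, end) indices of the
    first run of units set in the largest font size."""
    top = max(unit["font_size"] for unit in units)
    begin = next(i for i, unit in enumerate(units) if unit["font_size"] == top)
    end = begin
    while end + 1 < len(units) and units[end + 1]["font_size"] == top:
        end += 1
    return begin, end


def extract_between(units):
    """Join the text of all units from the begin unit through the end unit."""
    begin, end = find_boundaries(units)
    return " ".join(unit["text"] for unit in units[begin:end + 1])


PAGE = [
    {"text": "ACME Corp", "font_size": 10},
    {"text": "Annual", "font_size": 24},
    {"text": "Report", "font_size": 24},
    {"text": "Prepared by staff", "font_size": 10},
]

if __name__ == "__main__":
    print(extract_between(PAGE))
```

Treating the begin and end units as part of the extracted span is a choice made for this sketch; the abstract only says the information is extracted between the two labels.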
