Quick Patent Search - Search global patents quickly, free commercial patent database - IPRDB

1. Granted invention patent

US09189550B2 Query refinement in a browser toolbar (Active)
Publication No.: US09189550B2
Publication date: 2015-11-17
Application No.: US13299089
Filing date: 2011-11-17
Applicants: Timothy Edgar, Ambarish Chitnis, Ryen William White, Pavel Dmitriev, Rajanikanth Ageeru, Ovidiu Dan, Lin Tang
Inventors: Timothy Edgar, Ambarish Chitnis, Ryen William White, Pavel Dmitriev, Rajanikanth Ageeru, Ovidiu Dan, Lin Tang
IPC classification: G06F17/30
CPC classification: G06F17/30864
Abstract: Embodiments described herein are generally directed to a toolbar extension of a web browser that grabs a user's search engine query and suggests a refined search query known to yield better search results. The toolbar recognizes the web page the user is on as being associated with a search engine and retrieves the user's search query. The toolbar interacts with a refinement component on a server, and the refinement component determines a refined search query based on confidence scores assigned to data mined from a data center affiliated with a different search engine (one related to the toolbar). The refined search query is returned and displayed in a search field of the toolbar, allowing the user to easily run the refined search on the different search engine.
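The flow in this abstract (recognize a search-engine page, read the query, pick a refinement by confidence score) can be sketched roughly as follows. This is a minimal illustration, not the patented method: the host table SEARCH_HOSTS, the mined-data dictionary, and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlparse

# Hypothetical table: search-engine host -> name of its query parameter.
SEARCH_HOSTS: Dict[str, str] = {"search.example.com": "q"}


def extract_query(url: str) -> Optional[str]:
    """Return the user's query if the URL belongs to a known search engine."""
    parsed = urlparse(url)
    param = SEARCH_HOSTS.get(parsed.netloc)
    if param is None:
        return None  # not a recognized search-engine page
    values = parse_qs(parsed.query).get(param)
    return values[0] if values else None


def refine(query: str,
           mined: Dict[str, List[Tuple[str, float]]],
           threshold: float = 0.5) -> str:
    """Pick the highest-confidence refinement, or keep the original query.

    `mined` maps a normalized query to (refined query, confidence) pairs;
    both the structure and the threshold are assumptions for this sketch.
    """
    candidates = mined.get(query.lower().strip(), [])
    best = max(candidates, key=lambda c: c[1], default=None)
    if best is None or best[1] < threshold:
        return query
    return best[0]
```

In the abstract the refinement step runs on a server; here both steps are local functions so the sketch stays self-contained.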

2. Granted invention patent

US08484180B2 Graph-based seed selection algorithm for web crawlers (Active)
Publication No.: US08484180B2
Publication date: 2013-07-09
Application No.: US12477819
Filing date: 2009-06-03
Applicants: Pavel Dmitriev, Shuyi Zheng
Inventors: Pavel Dmitriev, Shuyi Zheng
IPC classification: G06F17/30
CPC classification: G06F17/30864
Abstract: One or more search seeds for Web crawling operations are selected. In a directed graph with Web pages represented by vertices and links represented by edges, characteristics of vertices connected to potential seed vertices are considered in making a seed selection.
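One way to picture seed selection on a directed link graph is a greedy pass that scores each candidate by the pages it links to. The sketch below is only an illustration under stated assumptions: the abstract does not say which vertex characteristics are used, so the per-page quality score, the greedy strategy, and the function name select_seeds are all hypothetical.

```python
from typing import Dict, List, Set


def select_seeds(graph: Dict[str, List[str]],
                 quality: Dict[str, float],
                 k: int) -> List[str]:
    """Greedily choose up to k seed pages from a directed link graph.

    graph maps each page to the pages it links to. A candidate's gain is
    the summed quality of its out-neighbors that no earlier seed already
    covers (an assumed scoring rule, not taken from the patent).
    """
    covered: Set[str] = set()
    seeds: List[str] = []
    candidates = set(graph)

    def gain(page: str) -> float:
        return sum(quality.get(n, 0.0) for n in set(graph[page]) - covered)

    for _ in range(min(k, len(candidates))):
        # Sorting first makes ties resolve alphabetically (deterministic).
        best = max(sorted(candidates), key=gain)
        seeds.append(best)
        covered.add(best)
        covered.update(graph[best])
        candidates.remove(best)
    return seeds
```

Discounting already-covered neighbors is a design choice that spreads seeds across the graph instead of clustering them around one well-linked region.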

3. Invention patent application

US20100312774A1 Graph-Based Seed Selection Algorithm For Web Crawlers (Active)
Publication No.: US20100312774A1
Publication date: 2010-12-09
Application No.: US12477819
Filing date: 2009-06-03
Applicants: Pavel Dmitriev, Shuyi Zheng
Inventors: Pavel Dmitriev, Shuyi Zheng
IPC classification: G06F17/30
CPC classification: G06F17/30864
Abstract: A method for selecting one or more search seeds for Web crawling operations is provided. In a directed graph with Web pages represented by vertices and links represented by edges, characteristics of vertices connected to potential seed vertices are considered in making a seed selection.

4. Granted invention patent

US09251249B2 Entity summarization and comparison (Active)
Publication No.: US09251249B2
Publication date: 2016-02-02
Application No.: US13316838
Filing date: 2011-12-12
Applicants: Pavel Dmitriev, Wei Zhuang
Inventors: Pavel Dmitriev, Wei Zhuang
IPC classification: G06F17/30
CPC classification: G06F17/30616, G06F17/30687
Abstract: An entity summarization system is described herein that mines the Internet and other data sources to provide answers to questions such as the relative sentiment of users towards various brands. The system uses a controlled vocabulary list describing a specific aspect of entities of interest. Given an entity name, the system scans the whole content corpus to collect statistics on the words that occur most frequently in the context of the entity name, taking into account proximity information, to produce a weighted list of vocabulary terms describing the entity. Two entities can be compared by normalizing and comparing their weighted term lists. In some embodiments, the system performs these procedures efficiently by leveraging an N-gram web model. Thus, the system provides an automated way to compare two entities to derive information about how users feel about the entities at any given time.
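The weighted term list and the normalize-then-compare step described in this abstract can be sketched as below. This is a toy version under explicit assumptions: whitespace tokenization, a single-token entity name, a 1/distance proximity weight, a five-token window, and cosine similarity as the comparison are all choices made for illustration, and no N-gram web model is used.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable


def summarize(entity: str, vocabulary: Iterable[str],
              corpus: Iterable[str], window: int = 5) -> Dict[str, float]:
    """Build a normalized weighted list of vocabulary terms near an entity."""
    vocab = {term.lower() for term in vocabulary}
    target = entity.lower()
    weights: Dict[str, float] = defaultdict(float)
    for doc in corpus:
        tokens = doc.lower().split()
        for i, token in enumerate(tokens):
            if token != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] in vocab:
                    # Closer terms count more (assumed proximity weighting).
                    weights[tokens[j]] += 1.0 / abs(j - i)
    total = sum(weights.values())
    # Normalize so lists for differently frequent entities are comparable.
    return {t: w / total for t, w in weights.items()} if total else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Compare two weighted term lists; 1.0 means identical direction."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

For a sentiment-style question, the controlled vocabulary would hold words such as "reliable" or "cheap", and the per-term differences between two normalized lists show where the entities diverge.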

5. Invention patent application

US20130151538A1 ENTITY SUMMARIZATION AND COMPARISON (Active)
Publication No.: US20130151538A1
Publication date: 2013-06-13
Application No.: US13316838
Filing date: 2011-12-12
Applicants: Pavel Dmitriev, Wei Zhuang
Inventors: Pavel Dmitriev, Wei Zhuang
IPC classification: G06F17/30
CPC classification: G06F17/30616, G06F17/30687
Abstract: An entity summarization system is described herein that mines the Internet and other data sources to provide answers to questions such as the relative sentiment of users towards various brands. The system uses a controlled vocabulary list describing a specific aspect of entities of interest. Given an entity name, the system scans the whole content corpus to collect statistics on the words that occur most frequently in the context of the entity name, taking into account proximity information, to produce a weighted list of vocabulary terms describing the entity. Two entities can be compared by normalizing and comparing their weighted term lists. In some embodiments, the system performs these procedures efficiently by leveraging an N-gram web model. Thus, the system provides an automated way to compare two entities to derive information about how users feel about the entities at any given time.

6. Invention patent application

US20100114858A1 HOST-BASED SEED SELECTION ALGORITHM FOR WEB CRAWLERS (Pending, published)
Publication No.: US20100114858A1
Publication date: 2010-05-06
Application No.: US12259164
Filing date: 2008-10-27
Applicant: Pavel Dmitriev
Inventor: Pavel Dmitriev
IPC classification: G06F17/30
CPC classification: G06F16/9537
Abstract: A host-based seed selection process considers factors such as quality, importance and potential yield of hosts in a decision to use a document of a host as a seed. A subset of a plurality of hosts is determined, including some but not all of the plurality of the hosts, according to an indication of importance of the hosts, according to an expected yield of new documents for the hosts, and according to preferences for the markets the hosts belong to. At least one seed is generated for each host of the determined subset of hosts, wherein each generated at least one seed includes an indication of a document in the linked database of documents. The generated seeds are provided to be accessible by a database crawler.
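The host-level selection in this abstract (rank hosts by importance, expected yield and market preference, keep a subset, emit one seed document per chosen host) might look roughly like this. The multiplicative score, the Host fields, and the use of each host's first listed document as its seed are assumptions made for the sketch; the abstract does not specify how the factors are combined.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Host:
    name: str
    importance: float        # e.g. a link-based importance estimate
    expected_yield: float    # expected number of new documents
    market: str              # market the host belongs to
    documents: List[str]     # known documents on the host


def select_host_seeds(hosts: List[Host],
                      market_preference: Dict[str, float],
                      n: int) -> Dict[str, str]:
    """Return {host name: seed document} for the n best-scoring hosts."""

    def score(host: Host) -> float:
        # Assumed combination rule: product of the three factors.
        pref = market_preference.get(host.market, 0.0)
        return host.importance * host.expected_yield * pref

    # A host with no known document cannot supply a seed.
    eligible = [h for h in hosts if h.documents]
    ranked = sorted(eligible, key=score, reverse=True)
    return {h.name: h.documents[0] for h in ranked[:n]}
```

Choosing n smaller than the number of hosts gives the "some but not all" subset from the abstract; the resulting mapping is what a crawler would then read.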
