Quick Patent Search: fast search of global patents, free patent database for commercial use - IPRDB

1. Invention application

US20130031461A1 DETECTING REPEAT PATTERNS ON A WEB PAGE [In force]
Publication No.: US20130031461A1
Publication date: 2013-01-31
Application No.: US13220351
Filing date: 2011-08-29
Applicant(s): Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
Inventor(s): Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
IPC classification: G06F17/00
CPC classification: G06F17/30536, G06F17/2247, G06F17/30702
Abstract: An exemplary embodiment of the present may generate a DOM-tree and generate a signal based on the DOM-tree and a node list. The signal may be analyzed and nodes may be selected within the signal to form a periodic wave. Repeat patterns may be detected using the periodic wave and the nodes.
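The abstract above gives only the outline (DOM tree, signal, periodic wave, repeat pattern). As a rough illustration of that general idea, and not the method claimed in the patent, the sketch below assumes the signal is the sequence of (tag, depth) pairs of start tags in document order and that periodicity can be estimated by the lag at which the signal best matches itself. The names `DepthSignal`, `dom_signal` and `best_period` are hypothetical, and the input is assumed to be well-formed HTML with explicit closing tags.

```python
from html.parser import HTMLParser


class DepthSignal(HTMLParser):
    """Collect a node list of (tag, depth) pairs in document order."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        self.nodes.append((tag, self.depth))
        self.depth += 1

    def handle_endtag(self, tag):
        # Assumes every start tag has a matching end tag.
        self.depth -= 1


def dom_signal(html):
    """Turn an HTML string into the (tag, depth) signal (an assumed encoding)."""
    parser = DepthSignal()
    parser.feed(html)
    return parser.nodes


def best_period(signal, min_period=2):
    """Return (lag, score) for the lag at which the signal best matches itself.

    score is the fraction of positions i with signal[i] == signal[i + lag].
    Ties keep the smallest lag.
    """
    best = (0, 0.0)
    n = len(signal)
    for lag in range(min_period, n // 2 + 1):
        matches = sum(signal[i] == signal[i + lag] for i in range(n - lag))
        score = matches / (n - lag)
        if score > best[1]:
            best = (lag, score)
    return best
```

With a list of four identical `<li><a>..</a><span>..</span></li>` items, each item contributes three nodes, so the estimated period is 3; a real page would need noise handling that this sketch leaves out.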

2. Granted invention patent

US08560940B2 Detecting repeat patterns on a web page using signals [In force]
Publication No.: US08560940B2
Publication date: 2013-10-15
Application No.: US13220351
Filing date: 2011-08-29
Applicant(s): Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
Inventor(s): Hui-Man Hou, Jian-Ming Jin, Li-Mei Jiao, Suk Hwan Lim
IPC classification: G06F17/00
CPC classification: G06F17/30536, G06F17/2247, G06F17/30702
Abstract: An exemplary embodiment of the present may generate a DOM-tree and generate a signal based on the DOM-tree and a node list. The signal may be analyzed and nodes may be selected within the signal to form a periodic wave. Repeat patterns may be detected using the periodic wave and the nodes.

3. Granted invention patent

US09405750B2 Discrete wavelet transform method for document structure similarity [In force]
Publication No.: US09405750B2
Publication date: 2016-08-02
Application No.: US14347572
Filing date: 2011-10-31
Applicant(s): Li-Mei Jiao, Jerry J. Liu, Hui-Man Hou, Cong-Lei Yao
Inventor(s): Li-Mei Jiao, Jerry J. Liu, Hui-Man Hou, Cong-Lei Yao
IPC classification: G06F17/30, G06F17/14, G06F17/22
CPC classification: G06F17/30011, G06F17/148, G06F17/2211, G06F17/2247, G06F17/30864, G06F17/30896
Abstract: Examples of the present disclosure may include methods, systems, and computer readable media with executable instructions. An example method for determining document structure similarity can include segmenting path sequences (206) of Document Object Model (DOM) trees (120, 462) from a number of web pages (202) into B components (561). Path signals (210) corresponding to the path sequences (206) are determined based on a count of the occurrences of particular paths in the Bth component (571), and unique path signals (210) are transformed into discrete wavelet signals (214) (572). The discrete wavelet signals (214) are analyzed at multiple DOM tree resolution levels (573).
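To make the steps in this abstract more concrete, here is a minimal sketch under stated assumptions: a page is represented as a flat list of root-to-node path strings, the list is cut into B consecutive segments, the path signal counts one target path per segment, and the Haar wavelet stands in for the discrete wavelet transform (the abstract does not say which wavelet is used). The function names `path_signal` and `haar_dwt` are hypothetical.

```python
def path_signal(paths, target, B):
    """Count occurrences of `target` in each of B consecutive segments of `paths`."""
    n = len(paths)
    signal = []
    for b in range(B):
        start, end = b * n // B, (b + 1) * n // B
        signal.append(paths[start:end].count(target))
    return signal


def haar_dwt(signal):
    """Haar decomposition of a signal whose length is a power of two.

    Returns a list of (approximation, detail) pairs, finest level first.
    Each level halves the resolution: approximation = pairwise mean,
    detail = half the pairwise difference.
    """
    approx = [float(x) for x in signal]
    levels = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        detail = [(a - b) / 2 for a, b in pairs]
        approx = [(a + b) / 2 for a, b in pairs]
        levels.append((approx, detail))
    return levels
```

Comparing two pages could then mean comparing their approximation vectors at a coarse level first and at finer levels only when the coarse ones agree; that comparison strategy is an assumption of this sketch, not something stated in the listing.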

4. Invention application

US20120059859A1 Data Extraction Method, Computer Program Product and System [In force]
Publication No.: US20120059859A1
Publication date: 2012-03-08
Application No.: US13258480
Filing date: 2009-11-25
Applicant(s): Li-Mei Jiao, Yuhong Xiong
Inventor(s): Li-Mei Jiao, Yuhong Xiong
IPC classification: G06F17/30
CPC classification: G06F17/30896
Abstract: Disclosed is a method of automatically extracting data from a target web page, comprising selecting (302) data in a source web page; determining (304) the respective DOM (document object model) trees of the source and target web page, and identifying the one or more nodes comprising the selected data in the source web page DOM tree; determining (306) matching paths in the respective DOM trees; for selected data in a node of an unmatched branch of the source web page DOM tree, identifying (308) the nearest matched path in the source web page; identifying (310) the unmatched branch nearest to the corresponding matched path in the target web page; determining (312) if said identified unmatched branch in the target web page DOM tree comprises a target node matching the selected data node; and if so: extracting (322) data from the target node if the mismatch between the respective unmatched branches does not exceed a predefined threshold. A computer program product and system implementing this method are also disclosed.
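The matching-and-threshold logic in this abstract can be illustrated with a much simplified sketch. It assumes each page is already reduced to a mapping from '/'-separated DOM paths to text, and it replaces the patent's branch comparison with a plain position-wise count of differing path steps; both the data model and the `mismatch` measure are assumptions made for illustration, and `extract` is a hypothetical name.

```python
def mismatch(path_a, path_b):
    """Count differing steps between two '/'-separated paths, position by position."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    differing = sum(x != y for x, y in zip(a, b))
    return differing + abs(len(a) - len(b))


def extract(source_path, target_doc, threshold=1):
    """Return the text in `target_doc` that corresponds to `source_path`.

    target_doc maps DOM path -> text. An exact path match wins; otherwise the
    nearest path is used only if its mismatch does not exceed `threshold`.
    Returns None when nothing is close enough.
    """
    if source_path in target_doc:
        return target_doc[source_path]
    nearest = min(target_doc, key=lambda p: mismatch(source_path, p), default=None)
    if nearest is not None and mismatch(source_path, nearest) <= threshold:
        return target_doc[nearest]
    return None
```

For example, data selected at `/html/body/div/article/p` in the source page would still be found at `/html/body/div/section/p` in the target page (one differing step), while a path with no close counterpart yields `None`.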

5. Granted invention patent

US09489161B2 Automatic selection of web page objects for printing [In force]
Publication No.: US09489161B2
Publication date: 2016-11-08
Application No.: US14353234
Filing date: 2011-10-25
Applicant(s): Ping Luo, Li-Mei Jiao, Zhang-Hui Chen, Huiman Hou
Inventor(s): Ping Luo, Li-Mei Jiao, Zhang-Hui Chen, Huiman Hou
IPC classification: G06F3/12, G06F17/21
CPC classification: G06F3/1273, G06F17/218, G06F2216/17
Abstract: A method includes receiving a request to print a current web page. A set of records that represent web pages that are similar to the current web page are identified from a print log that includes at least one record, each record including an indication of a web page and indicating one or more objects that had been previously selected for printing from that web page. Based on the objects that are indicated by the identified set of records, one or more objects of the current web page are selected to be printed on a printer.

6. Granted invention patent

US08667015B2 Data extraction method, computer program product and system [In force]
Publication No.: US08667015B2
Publication date: 2014-03-04
Application No.: US13258480
Filing date: 2009-11-25
Applicant(s): Li-Mei Jiao, Yuhong Xiong
Inventor(s): Li-Mei Jiao, Yuhong Xiong
IPC classification: G06F17/30
CPC classification: G06F17/30896
Abstract: Disclosed is a method of automatically extracting data from a target web page, comprising selecting (302) data in a source web page; determining (304) the respective DOM (document object model) trees of the source and target web page, and identifying the one or more nodes comprising the selected data in the source web page DOM tree; determining (306) matching paths in the respective DOM trees; for selected data in a node of an unmatched branch of the source web page DOM tree, identifying (308) the nearest matched path in the source web page; identifying (310) the unmatched branch nearest to the corresponding matched path in the target web page; determining (312) if said identified unmatched branch in the target web page DOM tree comprises a target node matching the selected data node; and if so: extracting (322) data from the target node if the mismatch between the respective unmatched branches does not exceed a predefined threshold. A computer program product and system implementing this method are also disclosed.

7. Invention application

US20150324091A1 DETECTING VALUABLE SECTIONS IN WEBPAGE [Under examination, published]
Publication No.: US20150324091A1
Publication date: 2015-11-12
Application No.: US14375834
Filing date: 2012-04-28
Applicant(s): Li-Mei Jiao, Xifei HUANG, Ping LUO
Inventor(s): Li-Mei Jiao, Xifei HUANG, Ping LUO
IPC classification: G06F3/0484, G06F17/30
CPC classification: G06F3/04842, G06F16/954, G06F16/957
Abstract: A method for detecting a valuable section within a web page is disclosed. The method comprises: receiving an input webpage; and detecting a valuable section in the input webpage based on a user log of a reference webpage associated with the input webpage, wherein said user log comprises a path of a section within the reference webpage that was accessed by a user in a DOM-tree that represents said reference webpage.
