    • 9. Granted invention patent
    • Methods, apparatus and computer programs for characterizing web resources
    • US07516397B2
    • 2009-04-07
    • US10901275
    • 2004-07-28
    • Sachindra Joshi; Raghuram Krishnapuram; Shourya Roy
    • G06F17/00
    • G06F17/30864; G06F17/30896
    • Methods, apparatus and computer programs are provided for characterizing Web-based information resources based on their interactions. A Web-based information resource is a single Web document or a collection of related Web documents. Unlike simple text documents, Web documents contain hyperlinks and other HTML tags. Different types of interactions, including inbound hyperlinks, outbound hyperlinks and internal links associated with a Web-based information resource, are used to characterize the Web-based information resource. A DOM tree representing the tag structure of a Web-based information resource is used to identify text items likely to be useful as context for a hyperlink anchor text, and the anchor text is combined with the context to generate a representation. The representation of Web-based information resources based on interactions can be used for clustering and classification, and in Web mining applications such as query disambiguation and automatic taxonomy generation.
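The abstract above describes combining hyperlink anchor text with nearby context found through the tag structure. The following is a minimal sketch of that idea for inbound links only, not the patented method: it assumes the context of a link is simply the text of its enclosing block element, and the names `AnchorContextParser`, `BLOCK_TAGS`, `inbound_representations` and `anchor_weight` are hypothetical.

```python
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser

# Assumption: these tags delimit the "context" of a link.
BLOCK_TAGS = {"p", "li", "td", "div", "h1", "h2", "h3"}


class AnchorContextParser(HTMLParser):
    """Collects (href, anchor text, enclosing-block text) for each hyperlink."""

    def __init__(self):
        super().__init__()
        self.block_text = []  # text pieces seen in the current block
        self.pending = []     # [href, anchor_parts] for links in the current block
        self.current = None   # the link that is currently open, if any
        self.links = []       # finished (href, anchor, context) triples

    def _flush_block(self):
        # Close the current block: attach its full text to every link inside it.
        context = " ".join(" ".join(self.block_text).split())
        for href, parts in self.pending:
            anchor = " ".join(" ".join(parts).split())
            self.links.append((href, anchor, context))
        self.block_text, self.pending = [], []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._flush_block()
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.current = [href, []]
                self.pending.append(self.current)

    def handle_endtag(self, tag):
        if tag == "a":
            self.current = None
        elif tag in BLOCK_TAGS:
            self._flush_block()

    def handle_data(self, data):
        self.block_text.append(data)
        if self.current is not None:
            self.current[1].append(data)

    def close(self):
        super().close()
        self._flush_block()


def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def inbound_representations(pages, anchor_weight=2):
    """Map each link target to a bag of words from anchor text plus context."""
    reps = defaultdict(Counter)
    for html in pages:
        parser = AnchorContextParser()
        parser.feed(html)
        parser.close()
        for href, anchor, context in parser.links:
            reps[href].update(tokens(context))  # every context word counts once
            for t in tokens(anchor):            # anchor words get extra weight
                reps[href][t] += anchor_weight
    return reps


pages = [
    '<ul><li>Cheap flights with <a href="http://a.example/">Acme Travel</a></li>'
    '<li>Read <a href="http://b.example/">the news</a> daily</li></ul>'
]
reps = inbound_representations(pages)
```

The resulting word counts per target URL could then feed any standard clustering or classification routine. Outbound and internal links, which the abstract also mentions, are not handled here.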
    • 10. Granted invention patent
    • Method and apparatus for populating a predefined concept hierarchy or other hierarchical set of classified data items by minimizing system entrophy
    • US07320000B2
    • 2008-01-15
    • US10309612
    • 2002-12-04
    • Krishna Prasad Chitrapura; Raghuram Krishnapuram; Sachindra Joshi
    • G06F7/10
    • G06F17/30; Y10S707/99937
    • A system and method for automated populating of an existing concept hierarchy of items with new items, using entropy as a measure of the correctness of a potential classification. User-defined concept hierarchies include, for example, document hierarchies such as directories for the Internet, library catalogues, patent databases and journals, and product hierarchies. These concept hierarchies can be huge and are usually maintained manually. An internet directory may have, for example, millions of Web sites, thousands of editors and hundreds of thousands of different categories. The method for populating a concept hierarchy includes calculating conditional ‘entropy’ values representing the randomness of distribution of classification attributes for the hierarchical set of classes if a new item is added to specific classes of the hierarchy and then selecting whichever class has the minimum randomness of distribution when calculated as a condition of insertion of the new data item.
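The selection rule in the abstract above (try the new item in each candidate class, compute a conditional entropy, keep the class with the minimum) can be illustrated with a small sketch. It rests on assumptions not stated in the abstract: the hierarchy is flattened to a set of leaf classes, the classification attributes are word counts, and the measure is H(class | attribute) in bits. The names `conditional_entropy` and `best_class` are hypothetical.

```python
import math
from collections import Counter


def conditional_entropy(classes):
    """H(class | attribute) in bits, weighted by attribute occurrences.

    classes: dict mapping class name -> Counter of attribute counts.
    """
    # Regroup the counts as attribute -> distribution over classes.
    per_attr = {}
    for name, counts in classes.items():
        for attr, n in counts.items():
            per_attr.setdefault(attr, Counter())[name] += n

    total = sum(sum(dist.values()) for dist in per_attr.values())
    h = 0.0
    for dist in per_attr.values():
        n_attr = sum(dist.values())
        for n in dist.values():
            if n > 0:
                # p(attr, class) * log2 p(class | attr)
                h -= (n / total) * math.log2(n / n_attr)
    return h


def best_class(classes, new_item):
    """Tentatively insert new_item into each class; return the lowest-entropy one."""
    scores = {}
    for name in classes:
        trial = {k: Counter(v) for k, v in classes.items()}  # copy before editing
        trial[name].update(new_item)
        scores[name] = conditional_entropy(trial)
    return min(scores, key=scores.get), scores


classes = {
    "sports": Counter({"football": 3, "goal": 2}),
    "finance": Counter({"stock": 3, "market": 2}),
}
new_item = Counter({"goal": 1, "football": 1})
chosen, scores = best_class(classes, new_item)
```

In this toy data the new item shares all of its words with "sports", so inserting it there leaves every word confined to a single class and the entropy stays at zero, while inserting it into "finance" spreads those words across two classes. A real concept hierarchy would need the measure applied level by level, which this sketch does not attempt.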