Quick Patent Search - Search global patents quickly, free patent database for commercial use - IPRDB

1. Granted patent

US07693806B2 Classification using a cascade approach (Lapsed)
Publication No.: US07693806B2
Publication date: 2010-04-06
Application No.: US11766434
Filing date: 2007-06-21
Applicants: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten
Inventors: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten
IPC classification: G06F15/18, G06N3/08
CPC classification: H04L51/12, G06K9/6256, G06Q10/06, G06Q10/10
Abstract: A system and method that facilitates and effectuates optimizing a classifier for greater performance in a specific region of classification that is of interest, such as a low false positive rate or a low false negative rate. A two-stage classification model can be trained and employed, where the first stage classification is optimized over the entire classification region and the second stage classifier is optimized for the specific region of interest. During training, the entire set of training data is employed by a first stage classifier. Only data that is classified by the first stage classifier or by cross validation to fall within a region of interest is used to train the second stage classifier. During classification, data that is classified within the region of interest by the first classification is given the first stage classifier's classification value, otherwise the classification value for the instance of data from the second stage classifier is used.
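
The training-data selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the score band `[low, high]` standing in for the "region of interest", the function name `select_second_stage_data`, and the toy `mean_score` first stage are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], int]  # (feature vector, label)


def select_second_stage_data(
    examples: List[Example],
    stage_one_score: Callable[[Sequence[float]], float],
    low: float,
    high: float,
) -> List[Example]:
    """Keep only the examples whose first-stage score lies in [low, high].

    The band [low, high] is a hypothetical stand-in for the region of
    interest; the second-stage classifier would be trained on the result.
    """
    return [ex for ex in examples if low <= stage_one_score(ex[0]) <= high]


def mean_score(features: Sequence[float]) -> float:
    """Toy first-stage scorer (assumption): the mean of the features."""
    return sum(features) / len(features)


# The first stage sees the whole training set; the second stage sees a subset.
training = [([0.9, 0.8], 1), ([0.1, 0.2], 0), ([0.6, 0.5], 1), ([0.5, 0.4], 0)]
subset = select_second_stage_data(training, mean_score, low=0.4, high=0.6)
```

Here only the two examples with first-stage scores near the middle of the range are passed on to second-stage training.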

2. Granted patent

US07464264B2 Training filters for detecting spam based on IP addresses and text-related features (In force)
Publication No.: US07464264B2
Publication date: 2008-12-09
Application No.: US10809163
Filing date: 2004-03-25
Applicants: Joshua T. Goodman, Robert L. Rounthwaite, Geoffrey J. Hulten, Wen-tau Yih
Inventors: Joshua T. Goodman, Robert L. Rounthwaite, Geoffrey J. Hulten, Wen-tau Yih
IPC classification: H04L9/00, G06F21/00
CPC classification: H04L51/12, G06Q10/107
Abstract: The subject invention provides for an intelligent quarantining system and method that facilitates detecting and preventing spam. In particular, the invention employs a machine learning filter specifically trained using origination features such as an IP address as well as destination features such as a URL. Moreover, the system and method involve training a plurality of filters using specific feature data for each filter. The filters are trained independently of each other, thus one feature may not unduly influence another feature in determining whether a message is spam. Because multiple filters are trained and available to scan messages either individually or in combination (at least two filters), the filtering or spam detection process can be generalized to new messages having slightly modified features (e.g., IP address). The invention also involves locating the appropriate IP addresses or URLs in a message as well as guiding filters to weigh origination or destination features more than text-based features.
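
As a rough illustration of independently trained per-feature filters, the sketch below trains one filter on IP addresses and another on text tokens, then combines them. Everything here is an assumption for the example: the `FeatureFilter` class, the smoothed spam-rate scoring, and the use of `max` to combine the two scores are not taken from the patent.

```python
from collections import defaultdict


class FeatureFilter:
    """A toy filter trained on one feature family only (hypothetical)."""

    def __init__(self, extract):
        self.extract = extract          # message -> list of feature strings
        self.spam = defaultdict(int)    # feature -> spam count
        self.total = defaultdict(int)   # feature -> total count

    def train(self, labelled_messages):
        for message, is_spam in labelled_messages:
            for feature in self.extract(message):
                self.total[feature] += 1
                self.spam[feature] += int(is_spam)

    def score(self, message):
        """Average smoothed spam rate over known features; 0.5 if none known."""
        known = [f for f in self.extract(message) if f in self.total]
        if not known:
            return 0.5
        return sum((self.spam[f] + 1) / (self.total[f] + 2) for f in known) / len(known)


ip_filter = FeatureFilter(lambda m: [m["ip"]])
text_filter = FeatureFilter(lambda m: m["text"].lower().split())

data = [
    ({"ip": "203.0.113.7", "text": "cheap pills now"}, True),
    ({"ip": "198.51.100.2", "text": "meeting notes attached"}, False),
]
# Each filter is trained separately on the same messages.
ip_filter.train(data)
text_filter.train(data)

# Same spam text sent from a previously unseen IP address.
new_message = {"ip": "203.0.113.99", "text": "cheap pills now"}
combined = max(ip_filter.score(new_message), text_filter.score(new_message))
```

Because the text filter was trained without reference to the IP feature, the message is still scored as likely spam even though its IP address is new.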

3. Granted patent

US07689652B2 Using IP address and domain for email spam filtering (In force)
Publication No.: US07689652B2
Publication date: 2010-03-30
Application No.: US11031672
Filing date: 2005-01-07
Applicants: Manav Mishra, Elissa E. S. Murphy, Geoffrey J Hulten, Joshua T. Goodman, Wen-Tau Yih
Inventors: Manav Mishra, Elissa E. S. Murphy, Geoffrey J Hulten, Joshua T. Goodman, Wen-Tau Yih
IPC classification: G06F15/16, G06F15/173
CPC classification: H04L51/28, H04L29/1215, H04L51/12, H04L61/1564, H04L63/0227, H04L63/1441
Abstract: Email spam filtering is performed based on a combination of IP address and domain. When an email message is received, an IP address and a domain associated with the email message are determined. A cross product of the IP address (or portions of the IP address) and the domain (or portions of the domain) is calculated. If the email message is known to be either spam or non-spam, then a spam score based on the known spam status is stored in association with each (IP address, domain) pair element of the cross product. If the spam status of the email message is not known, then the (IP address, domain) pair elements of the cross product are used to look up previously determined spam scores. A combination of the previously determined spam scores is used to determine whether or not to treat the received email message as spam.
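
The cross-product bookkeeping in this abstract might look something like the sketch below. The choice of "portions" (full address plus its first three and first two octets; the domain plus its parent domains), the spam-rate score, and the averaging used to combine looked-up scores are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import product


def ip_portions(ip):
    """Full IPv4 address plus its first three and first two octets (assumed)."""
    octets = ip.split(".")
    return [".".join(octets[:n]) for n in (4, 3, 2)]


def domain_portions(domain):
    """The domain and each parent domain, excluding the bare TLD (assumed)."""
    labels = domain.split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1)]


def pairs(ip, domain):
    """Cross product of IP portions and domain portions."""
    return list(product(ip_portions(ip), domain_portions(domain)))


class PairScores:
    def __init__(self):
        self.counts = {}  # (ip portion, domain portion) -> [spam count, total]

    def record(self, ip, domain, is_spam):
        """Store evidence from a message whose spam status is known."""
        for pair in pairs(ip, domain):
            entry = self.counts.setdefault(pair, [0, 0])
            entry[0] += int(is_spam)
            entry[1] += 1

    def score(self, ip, domain):
        """Average spam rate over previously seen pairs; None if none seen."""
        seen = [self.counts[p] for p in pairs(ip, domain) if p in self.counts]
        if not seen:
            return None
        return sum(spam / total for spam, total in seen) / len(seen)


store = PairScores()
store.record("203.0.113.7", "mail.example.com", is_spam=True)
store.record("203.0.113.9", "example.com", is_spam=False)
estimate = store.score("203.0.113.50", "example.com")
```

A message from an unseen address in the same range still matches the coarser pairs, which is what lets the lookup generalize beyond exact (IP address, domain) matches.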

4. Patent application

US20080319932A1 CLASSIFICATION USING A CASCADE APPROACH (Lapsed)
Publication No.: US20080319932A1
Publication date: 2008-12-25
Application No.: US11766434
Filing date: 2007-06-21
Applicants: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten
Inventors: Wen-tau Yih, Joshua T. Goodman, Geoffrey J. Hulten
IPC classification: G06F15/18
CPC classification: H04L51/12, G06K9/6256, G06Q10/06, G06Q10/10
Abstract: A system and method that facilitates and effectuates optimizing a classifier for greater performance in a specific region of classification that is of interest, such as a low false positive rate or a low false negative rate. A two-stage classification model can be trained and employed, where the first stage classification is optimized over the entire classification region and the second stage classifier is optimized for the specific region of interest. During training, the entire set of training data is employed by a first stage classifier. Only data that is classified by the first stage classifier or by cross validation to fall within a region of interest is used to train the second stage classifier. During classification, data that is classified within the region of interest by the first classification is given the first stage classifier's classification value, otherwise the classification value for the instance of data from the second stage classifier is used.

5. Granted patent

US08135728B2 Web document keyword and phrase extraction (In force)
Publication No.: US08135728B2
Publication date: 2012-03-13
Application No.: US11619230
Filing date: 2007-01-03
Applicants: Wen-tau Yih, Joshua T. Goodman, Vitor Rocha de Carvalho
Inventors: Wen-tau Yih, Joshua T. Goodman, Vitor Rocha de Carvalho
IPC classification: G06F7/00, G06F17/30, G06F13/14
CPC classification: G06F17/241, G06F17/27, G06F17/30, G06F17/30616
Abstract: Extraction analysis techniques biased, in part, by query frequency information from a query log file and/or search engine cache are employed along with machine learning processes to determine candidate keywords and/or phrases of web documents. Web oriented features associated with the candidate keywords and/or phrases are also utilized to analyze the web documents. A keyword and/or phrase extraction mechanism can be utilized to score keywords and/or phrases in a web document and estimate a likelihood that the keywords and/or phrases are relevant, for example, in an advertising system and the like.
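
To give a feel for how query frequency can bias keyword scoring, here is a deliberately simple sketch. It replaces the machine-learned scorer mentioned in the abstract with a hand-set formula (in-document count times the log of query-log frequency), so the formula, the `rank_keywords` name, and the sample `query_counts` values are all assumptions.

```python
import math
import re
from collections import Counter


def rank_keywords(text, query_counts, top_k=3):
    """Rank one- and two-word candidates by count * log(1 + query frequency)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    candidates = Counter(words)
    candidates.update(" ".join(pair) for pair in zip(words, words[1:]))

    def score(phrase):
        # Candidates absent from the (hypothetical) query log score zero.
        return candidates[phrase] * math.log1p(query_counts.get(phrase, 0))

    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(candidates, key=lambda p: (-score(p), p))
    return ranked[:top_k]


page = "Digital camera reviews: the best digital camera for travel"
query_counts = {"digital camera": 900, "camera": 500, "travel": 50}
keywords = rank_keywords(page, query_counts)
```

Phrases that people actually search for rise to the top, while frequent but unqueried words such as "the" drop out.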

6. Patent application

US20080109425A1 Document summarization by maximizing informative content words (In force)
Publication No.: US20080109425A1
Publication date: 2008-05-08
Application No.: US11591937
Filing date: 2006-11-02
Applicants: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki
Inventors: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki
IPC classification: G06F17/30, G06F15/18, G06F9/44
CPC classification: G06F17/30719
Abstract: Document summarization is performed by scoring individual words in sentences in a document or document cluster. Sentences from the document or document cluster are selected to form a summary based on the scores of the words contained in those sentences.
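
A minimal sketch of word-score-based sentence selection follows. The abstract does not say how words are scored, so this example assumes relative word frequency as the word score and the average word score as the sentence score; the `summarize` function and its parameters are hypothetical.

```python
import re
from collections import Counter


def summarize(sentences, max_sentences=1):
    """Pick the sentences whose words have the highest average score."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    counts = Counter(word for words in tokenized for word in words)
    total = sum(counts.values())

    def sentence_score(words):
        # Word score (assumed): relative frequency in the document.
        if not words:
            return 0.0
        return sum(counts[w] / total for w in words) / len(words)

    ranked = sorted(
        range(len(sentences)),
        key=lambda i: sentence_score(tokenized[i]),
        reverse=True,
    )
    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(ranked[:max_sentences])]


document = [
    "Spam filters score email.",
    "Filters score spam email daily.",
    "Lunch was good.",
]
summary = summarize(document, max_sentences=1)
```

The off-topic sentence scores lowest because its words occur only once in the document.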

7. Granted patent

US07702680B2 Document summarization by maximizing informative content words (In force)
Publication No.: US07702680B2
Publication date: 2010-04-20
Application No.: US11591937
Filing date: 2006-11-02
Applicants: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki
Inventors: Wen-tau Yih, Joshua T. Goodman, Lucretia H. Vanderwende, Hisami Suzuki
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/30719
Abstract: Document summarization is performed by scoring individual words in sentences in a document or document cluster. Sentences from the document or document cluster are selected to form a summary based on the scores of the words contained in those sentences.

8. Granted patent

US07930353B2 Trees of classifiers for detecting email spam (In force)
Publication No.: US07930353B2
Publication date: 2011-04-19
Application No.: US11193691
Filing date: 2005-07-29
Applicants: David M. Chickering, Geoffrey J. Hulten, Robert L. Rounthwaite, Christopher A. Meek, David E. Heckerman, Joshua T. Goodman
Inventors: David M. Chickering, Geoffrey J. Hulten, Robert L. Rounthwaite, Christopher A. Meek, David E. Heckerman, Joshua T. Goodman
IPC classification: G06F15/16
CPC classification: H04L51/12
Abstract: Decision trees populated with classifier models are leveraged to provide enhanced spam detection utilizing separate email classifiers for each feature of an email. This provides a higher probability of spam detection through tailoring of each classifier model to facilitate more accurate determination of spam on a feature-by-feature basis. Classifiers can be constructed based on linear models such as, for example, logistic-regression models and/or support vector machines (SVM) and the like. The classifiers can also be constructed based on decision trees. “Compound features” based on internal and/or external nodes of a decision tree can be utilized to provide linear classifier models as well. Smoothing of the spam detection results can be achieved by utilizing classifier models from other nodes within the decision tree if training data is sparse. This forms a base model for branches of a decision tree that may not have received substantial training data.
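
One way to picture a decision tree whose nodes carry classifier models is sketched below. It uses a simple back-off rule in place of the smoothing the abstract mentions: when a node has seen too little training data, the model of its nearest well-trained ancestor is used instead. The `Node` structure, the `min_train` threshold, and the feature names are assumptions for the example.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Node:
    weights: Dict[str, float]   # logistic-regression weights at this node
    n_train: int                # training messages that reached this node
    split: Optional[Callable[[Dict[str, float]], bool]] = None  # None at a leaf
    yes: Optional["Node"] = None
    no: Optional["Node"] = None


def logistic(weights, x):
    z = sum(w * x.get(name, 0.0) for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))


def predict(root, x, min_train=50):
    """Route x down the tree; back off to an ancestor model if data is sparse."""
    node, model = root, root.weights
    while node.split is not None:
        node = node.yes if node.split(x) else node.no
        if node.n_train >= min_train:
            model = node.weights  # this node's model is trusted
    return logistic(model, x)


root = Node(
    weights={"has_link": 1.0},
    n_train=1000,
    split=lambda x: x.get("has_attachment", 0.0) > 0,
    yes=Node(weights={"has_link": 3.0}, n_train=10),    # sparse branch
    no=Node(weights={"has_link": -1.0}, n_train=500),
)
p_sparse = predict(root, {"has_link": 1.0, "has_attachment": 1.0})
p_dense = predict(root, {"has_link": 1.0})
```

The message routed to the sparse branch is scored with the root model, while the other message uses its own leaf's model.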

9. Granted patent

US07543053B2 Intelligent quarantining for spam prevention (In force)
Publication No.: US07543053B2
Publication date: 2009-06-02
Application No.: US10779295
Filing date: 2004-02-13
Applicants: Joshua T. Goodman, Robert L. Rounthwaite, Geoffrey J. Hulten, Derek Hazeur
Inventors: Joshua T. Goodman, Robert L. Rounthwaite, Geoffrey J. Hulten, Derek Hazeur
IPC classification: G06F15/173
CPC classification: G06Q10/107, H04L51/12
Abstract: The subject invention provides for an intelligent quarantining system and method that facilitates a more robust classification system in connection with spam prevention. The invention involves holding back some messages that appear to be questionable, suspicious, or untrustworthy from classification (as spam or good). In particular, the filter lacks information about these messages and thus classification is temporarily delayed. This provides more time for a filter update to arrive with a more accurate classification. The suspicious messages can be quarantined for a determined time period to allow more data to be collected regarding these messages. A number of factors can be employed to determine whether messages are more likely to be flagged for further analysis. User feedback by way of a feedback loop system can also be utilized to facilitate classification of the messages. After some time period, classification of the messages can be resumed.
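
The hold-then-reclassify flow described here could be sketched as below. The uncertainty band (`low`, `high`), the one-hour hold, the `final_threshold`, and the `Quarantine` class itself are hypothetical choices for the example; time is passed in explicitly so the behaviour is deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Quarantine:
    score: Callable[[str], float]   # hypothetical filter: message -> spam probability
    low: float = 0.3                # below this: confidently good
    high: float = 0.7               # at or above this: confidently spam
    hold_seconds: float = 3600.0    # assumed quarantine period
    final_threshold: float = 0.5    # assumed cut-off once the hold expires
    held: List[Tuple[float, str]] = field(default_factory=list)

    def classify(self, message, now):
        p = self.score(message)
        if self.low < p < self.high:
            # Uncertain: delay the decision instead of guessing.
            self.held.append((now + self.hold_seconds, message))
            return "quarantined"
        return "spam" if p >= self.high else "good"

    def release(self, now):
        """Re-score messages whose hold has expired, using the current filter."""
        due = [m for t, m in self.held if t <= now]
        self.held = [(t, m) for t, m in self.held if t > now]
        return [
            (m, "spam" if self.score(m) >= self.final_threshold else "good")
            for m in due
        ]


scores = {"hello": 0.1, "offer": 0.5}   # stand-in for a trainable filter
quarantine = Quarantine(score=lambda m: scores[m])
first = quarantine.classify("hello", now=0.0)
second = quarantine.classify("offer", now=0.0)
scores["offer"] = 0.9                   # simulates a filter update arriving
early = quarantine.release(now=1800.0)
late = quarantine.release(now=3600.0)
```

The uncertain message is decided only after the hold expires, by which point the updated filter scores it more decisively.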

10. Granted patent

US07634810B2 Phishing detection, prevention, and notification (In force)
Publication No.: US07634810B2
Publication date: 2009-12-15
Application No.: US11129222
Filing date: 2005-05-13
Applicants: Joshua T. Goodman, Paul S Rehfuss, Robert L. Rounthwaite, Manav Mishra, Geoffrey J Hulten, Kenneth G Richards, Aaron H Averbuch, Anthony P. Penta, Roderic C Deyo
Inventors: Joshua T. Goodman, Paul S Rehfuss, Robert L. Rounthwaite, Manav Mishra, Geoffrey J Hulten, Kenneth G Richards, Aaron H Averbuch, Anthony P. Penta, Roderic C Deyo
IPC classification: H04L29/06, G06F21/00
CPC classification: H04L63/1416, H04L51/12, H04L63/1466, H04L63/1483
Abstract: Phishing detection, prevention, and notification is described. In an embodiment, a messaging application facilitates communication via a messaging user interface, and receives a communication, such as an email message, from a domain. A phishing detection module detects a phishing attack in the communication by determining that the domain is similar to a known phishing domain, or by detecting suspicious network properties of the domain. In another embodiment, a Web browsing application receives content, such as data for a Web page, from a network-based resource, such as a Web site or domain. The Web browsing application initiates a display of the content, and a phishing detection module detects a phishing attack in the content by determining that a domain of the network-based resource is similar to a known phishing domain, or that an address of the network-based resource from which the content is received has suspicious network properties.
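
The abstract does not say how domain similarity is measured. One plausible stand-in, shown here purely as an assumption, is Levenshtein edit distance against a list of known phishing domains; the `max_distance` threshold and the sample domain are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def resembles_known_phish(domain, known_phishing_domains, max_distance=2):
    """True if the domain is within max_distance edits of any known phishing domain."""
    d = domain.lower()
    return any(
        edit_distance(d, known.lower()) <= max_distance
        for known in known_phishing_domains
    )


known = ["examp1e-login.test"]  # hypothetical list of known phishing domains
suspicious = resembles_known_phish("exampIe-login.test", known)
unrelated = resembles_known_phish("weather.test", known)
```

A one-character variation of a listed domain is flagged, while an unrelated domain is not.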
