    • 44. Invention application
    • SYSTEM AND METHOD FOR SCALABLE COST-SENSITIVE LEARNING
    • US20100169252A1
    • 2010-07-01
    • US12690502
    • 2010-01-20
    • Wei Fan; Haixun Wang; Philip S. Yu
    • Wei Fan; Haixun Wang; Philip S. Yu
    • G06N3/12; G06F15/18
    • G06N99/005
    • A method (and structure) for processing an inductive learning model for a dataset of examples, includes dividing the dataset of examples into a plurality of subsets of data and generating, using a processor on a computer, a learning model using examples of a first subset of data of the plurality of subsets of data. The learning model being generated for the first subset comprises an initial stage of an evolving aggregate learning model (ensemble model) for an entirety of the dataset, the ensemble model thereby providing an evolving estimated learning model for the entirety of the dataset if all the subsets were to be processed. The generating of the learning model using data from a subset includes calculating a value for at least one parameter that provides an objective indication of an adequacy of a current stage of the ensemble model.
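The abstract of entry 44 describes training one model per data subset, folding each into an evolving ensemble, and computing a parameter that indicates whether the current ensemble is adequate. A minimal sketch of that loop follows. The abstract does not say which base learner or which adequacy parameter is used, so the nearest-centroid learner, the hold-out accuracy measure, the `tol` threshold, and all function names here are illustrative assumptions rather than the patented method.

```python
import random
from statistics import mean

def split_into_subsets(examples, k):
    """Divide the dataset into k disjoint subsets (round-robin)."""
    return [examples[i::k] for i in range(k)]

def train_centroid_model(subset):
    """Hypothetical base learner: per-class mean of one numeric feature."""
    by_label = {}
    for x, y in subset:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}

def ensemble_predict(models, x):
    """Each model votes for the class with the nearest centroid."""
    votes = {}
    for m in models:
        label = min(m, key=lambda y: abs(m[y] - x))
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def accuracy(models, holdout):
    """Assumed adequacy parameter: ensemble accuracy on held-out data."""
    return mean(1.0 if ensemble_predict(models, x) == y else 0.0
                for x, y in holdout)

def progressive_ensemble(examples, holdout, k=10, tol=0.005):
    """Grow the ensemble one subset at a time; stop early once the
    adequacy parameter changes by less than tol between stages."""
    models, history = [], []
    for subset in split_into_subsets(examples, k):
        models.append(train_centroid_model(subset))
        history.append(accuracy(models, holdout))
        if len(history) >= 2 and abs(history[-1] - history[-2]) < tol:
            break
    return models, history

# Synthetic two-class data, for illustration only.
rng = random.Random(0)
data = [(rng.gauss(0, 1), 0) for _ in range(600)] + \
       [(rng.gauss(3, 1), 1) for _ in range(600)]
rng.shuffle(data)
train, holdout = data[:1000], data[1000:]
models, history = progressive_ensemble(train, holdout)
```

Here `history` records the adequacy value after each stage, so the point at which training stopped can be inspected without processing every subset.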
    • 45. Granted invention patent
    • System and method for tree structure indexing that provides at least one constraint sequence to preserve query-equivalence between xml document structure match and subsequence match
    • US07475070B2
    • 2009-01-06
    • US11035889
    • 2005-01-14
    • Wei Fan; Haixun Wang; Philip S. Yu
    • Wei Fan; Haixun Wang; Philip S. Yu
    • G06F17/30; G06F17/00
    • G06F17/30935; Y10S707/99933; Y10S707/99936
    • Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. Herein, there is addressed the problem of query equivalence with respect to this transformation, and there is introduced a performance-oriented principle for sequencing tree structures. With query equivalence, XML queries can be performed through subsequence matching without join operations, post-processing, or other special handling for problems such as false alarms. There is identified a class of sequencing methods for this purpose, and there is presented a novel subsequence matching algorithm that observes query equivalence. For any given XML dataset, the principle finds an optimal sequencing strategy according to its schema and its data distribution; there is thus presented herein a novel method that realizes this principle.
    • 46. Granted invention patent
    • System and method for continuous diagnosis of data streams
    • US07464068B2
    • 2008-12-09
    • US10880913
    • 2004-06-30
    • Wei Fan; Haixun Wang; Philip S. Yu
    • Wei Fan; Haixun Wang; Philip S. Yu
    • G06F17/30
    • G06F17/30017; G06F2216/03; Y10S707/99931; Y10S707/99935
    • In connection with the mining of time-evolving data streams, there is provided a general framework that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, there are defined herein statistical profiling methods that extend a classification tree in order to guess the percentage of drift in the data stream without any labeled data. The exact error can be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, a small number of true labels is preferably re-sampled to reconstruct the decision tree from the leaf node level.
    • 48. Granted invention patent
    • System and methods for anomaly detection and adaptive learning
    • US07424619B1
    • 2008-09-09
    • US10269694
    • 2002-10-11
    • Wei Fan; Salvatore J. Stolfo
    • Wei Fan; Salvatore J. Stolfo
    • G06F11/30; G06F11/00; G06F17/30; H04L9/32; H04L12/56; H04L12/26; H04L12/24; G05B13/02
    • H04L63/1425; G06F21/552; H04L41/16; H04L43/00
    • A method of generating an anomaly detection model for classifying activities of a computer system uses a training set of data corresponding to activity on the computer system, the training set comprising a plurality of instances of data having features, wherein each feature in said plurality of features has a plurality of values. For a selected feature and a selected value of the selected feature, a quantity is determined which corresponds to the relative sparsity of such value. The quantity may correspond to the difference between the number of occurrences of the selected value and the number of occurrences of the most frequently occurring value. These instances are classified as anomalies and added to the training set of normal data to generate a rule set or other detection model.
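The sparsity quantity in the abstract of entry 48 can be illustrated with a small count-based sketch. It covers only the stated difference between the occurrence count of a selected value and that of the most frequent value; how that quantity is then used to produce anomaly instances is not specified in the abstract and is not attempted here. The function name and the sample `service` records are hypothetical.

```python
from collections import Counter

def sparsity_quantities(instances, feature):
    """For each value of `feature`, return the gap between the count of
    the most frequent value and the count of that value. A larger gap
    means the value is relatively sparser in the training set."""
    counts = Counter(inst[feature] for inst in instances)
    most_common_count = max(counts.values())
    return {value: most_common_count - n for value, n in counts.items()}

# Hypothetical training records describing normal activity.
records = (
    [{"service": "http"}] * 5
    + [{"service": "ftp"}] * 2
    + [{"service": "telnet"}]
)
gaps = sparsity_quantities(records, "service")
# gaps == {"http": 0, "ftp": 3, "telnet": 4}
```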
    • 49. Granted invention patent
    • System and method for indexing weighted-sequences in large databases
    • US07418455B2
    • 2008-08-26
    • US10723229
    • 2003-11-26
    • Wei Fan; Chang-Shing Perng; Haixun Wang; Philip Shi-Lung Yu
    • Wei Fan; Chang-Shing Perng; Haixun Wang; Philip Shi-Lung Yu
    • G06F7/00; G06F17/00
    • G06F17/30327; G06F17/30548; Y10S707/99943
    • The present invention provides an index structure for managing weighted-sequences in large databases. A weighted-sequence is defined as a two-dimensional structure in which each element in the sequence is associated with a weight. A series of network events, for instance, is a weighted-sequence because each event is associated with a timestamp. Querying a large sequence database by events' occurrence patterns is a first step towards understanding the temporal causal relationships among the events. The index structure proposed herein enables the efficient retrieval from the database of all subsequences (contiguous and non-contiguous) that match a given query sequence both by events and by weights. The index structure also takes into consideration the nonuniform frequency distribution of events in the sequence data.
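To make the query semantics in the abstract of entry 49 concrete, the sketch below finds all subsequences (contiguous or not) that match a query both by event and by weight. It is a brute-force scan, not the index structure the patent proposes, and it assumes that weights are timestamps and that a query gives each event's offset from the first matched event. Both of those choices, and the function name, are illustrative assumptions.

```python
def find_matches(sequence, query):
    """Return index tuples of subsequences of `sequence` matching `query`.

    sequence: list of (event, timestamp) pairs in timestamp order.
    query:    list of (event, offset) pairs; offsets are relative to
              the first query element, whose offset is taken as 0.
    """
    matches = []

    def extend(q_pos, start_idx, base, chosen):
        if q_pos == len(query):
            matches.append(tuple(chosen))
            return
        event, offset = query[q_pos]
        for i in range(start_idx, len(sequence)):
            ev, ts = sequence[i]
            # The first element fixes the base time; later elements must
            # match both the event and the required time offset.
            if ev == event and (base is None or ts - base == offset):
                extend(q_pos + 1, i + 1,
                       ts if base is None else base, chosen + [i])

    extend(0, 0, None, [])
    return matches

# Hypothetical series of network events with timestamps as weights.
events = [("a", 0), ("b", 2), ("a", 3), ("c", 5), ("b", 5), ("c", 8)]
pattern = [("a", 0), ("b", 2), ("c", 5)]
result = find_matches(events, pattern)
# result == [(0, 1, 3), (2, 4, 5)]
```

The first match skips the element at index 2, which shows the non-contiguous case; an index such as the one described above would aim to return the same matches without scanning the whole sequence.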