Quick Patent Search - Fast global patent search, free commercial-use patent database - IPRDB

61. Granted patent

US08010541B2 Systems and methods for condensation-based privacy in strings (lapsed)
Publication No.: US08010541B2
Publication date: 2011-08-30
Application No.: US11540406
Filing date: 2006-09-30
Applicant: Charu C. Aggarwal, Philip S. Yu
Inventor: Charu C. Aggarwal, Philip S. Yu
IPC classification: G06F17/30
CPC classification: G06F21/6245
Abstract: Novel methods and systems for the privacy preserving mining of string data with the use of simple template based models. Such template based models are effective in practice, and preserve important statistical characteristics of the strings such as intra-record distances. Discussed herein is the condensation model for anonymization of string data. Summary statistics are created for groups of strings, and these statistics are used to generate pseudo-strings. It will be seen that the aggregate behavior of a new set of strings maintains key characteristics such as composition, the order of the intra-string distances, and the accuracy of data mining algorithms such as classification. The preservation of intra-string distances is a key goal in many string and biological applications which are deeply dependent upon the computation of such distances, while it can be shown that the accuracy of applications such as classification is not affected by the anonymization process.
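
The condensation idea in this abstract (group the strings, keep only summary statistics per group, then generate pseudo-strings from those statistics) can be illustrated with a minimal sketch. It is not the patented method: the function names `condense` and `generate_pseudo_strings`, the assumption that all strings have equal length, the use of lexicographic sorting as a stand-in for similarity-based grouping, and the choice of per-position character counts as the summary statistic are all assumptions made for illustration.

```python
import random
from collections import Counter


def condense(strings, k):
    """Group equal-length strings into groups of at least k members and
    keep only per-position character counts for each group (hypothetical
    summary statistic; the patent's template model may differ)."""
    ordered = sorted(strings)  # crude stand-in for similarity-based grouping
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()      # fold a short final group into its neighbour
        groups[-1].extend(tail)  # so every group has at least k members
    stats = []
    for group in groups:
        length = len(group[0])   # assumption: all strings share this length
        counts = [Counter(s[pos] for s in group) for pos in range(length)]
        stats.append((len(group), counts))
    return stats


def generate_pseudo_strings(stats, rng):
    """Draw one pseudo-string per original record, sampling each position
    independently from the group's character counts."""
    pseudo = []
    for size, counts in stats:
        for _ in range(size):
            chars = [
                rng.choices(list(c.keys()), weights=list(c.values()))[0]
                for c in counts
            ]
            pseudo.append("".join(chars))
    return pseudo


records = ["ACGT", "ACGA", "TCGT", "TTGA", "ACTT"]
stats = condense(records, k=2)
pseudo = generate_pseudo_strings(stats, random.Random(0))
```

Only the group sizes and counts in `stats` are needed to produce `pseudo`, so the original records can be discarded after condensation. Sampling positions independently preserves per-position composition within a group but not correlations between positions.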

62. Granted patent

US07890294B2 Systems for structural clustering of time sequences (in force)
Publication No.: US07890294B2
Publication date: 2011-02-15
Application No.: US12115166
Filing date: 2008-05-05
Applicant: Vittorio Castelli, Michail Vlachos, Philip S. Yu
Inventor: Vittorio Castelli, Michail Vlachos, Philip S. Yu
IPC classification: G06F15/00
CPC classification: G06K9/00523, Y10S706/90, Y10S707/99936, Y10S707/99937, Y10S707/99945, Y10S707/99953
Abstract: Arrangements are provided for performing structural clustering between different time series. Time series data relating to a plurality of time series is accepted, structural features relating to the time series data are ascertained, and at least one distance between different time series is determined by employing the structural features. The different time series may be partitioned into clusters based on the at least one distance, and/or the k closest matches to a given time series query based on the at least one distance may be returned.

63. Granted patent

US07853545B2 Preserving privacy of one-dimensional data streams using dynamic correlations (lapsed)
Publication No.: US07853545B2
Publication date: 2010-12-14
Application No.: US11678786
Filing date: 2007-02-26
Applicant: Yuan-Chi Chang, Feifei Li, Spyridon Papadimitriou, George A. Mihaila, Ioana Stanoi, Jimeng Sun, Philip S. Yu
Inventor: Yuan-Chi Chang, Feifei Li, Spyridon Papadimitriou, George A. Mihaila, Ioana Stanoi, Jimeng Sun, Philip S. Yu
IPC classification: G06F17/00
CPC classification: H04L63/0407, G06F21/6245, G06F2207/7219
Abstract: Disclosed is a method, information processing system, and computer readable medium for preserving privacy of nonstationary data streams. The method includes receiving at least one nonstationary data stream with time dependent data. A set of first-moment statistical values is calculated, for a given instant of sub-space of time, for the data. The first moment statistical values include a principal component for the sub-space of time. The data is perturbed with noise along the principal component in proportion to the first-moment of statistical values so that at least part of a set of second-moment statistical values for the data is perturbed by the noise only within a predetermined variance.

64. Patent application

US20100169252A1 SYSTEM AND METHOD FOR SCALABLE COST-SENSITIVE LEARNING (in force)
Publication No.: US20100169252A1
Publication date: 2010-07-01
Application No.: US12690502
Filing date: 2010-01-20
Applicant: Wei Fan, Haixun Wang, Philip S. Yu
Inventor: Wei Fan, Haixun Wang, Philip S. Yu
IPC classification: G06N3/12, G06F15/18
CPC classification: G06N99/005
Abstract: A method (and structure) for processing an inductive learning model for a dataset of examples includes dividing the dataset of examples into a plurality of subsets of data and generating, using a processor on a computer, a learning model using examples of a first subset of data of the plurality of subsets of data. The learning model generated for the first subset comprises an initial stage of an evolving aggregate learning model (ensemble model) for an entirety of the dataset, the ensemble model thereby providing an evolving estimated learning model for the entirety of the dataset if all the subsets were to be processed. The generating of the learning model using data from a subset includes calculating a value for at least one parameter that provides an objective indication of an adequacy of a current stage of the ensemble model.

65. Patent application

US20090074043A1 RESOURCE ADAPTIVE SPECTRUM ESTIMATION OF STREAMING DATA (in force)
Publication No.: US20090074043A1
Publication date: 2009-03-19
Application No.: US12177300
Filing date: 2008-07-22
Applicant: Deepak Srinivac Turaga, Michail Vlachos, Philip S. Yu
Inventor: Deepak Srinivac Turaga, Michail Vlachos, Philip S. Yu
IPC classification: H04B17/00
CPC classification: G06F17/141
Abstract: Streaming environments typically dictate incomplete or approximate algorithm execution, in order to cope with sudden surges in the data rate. Such limitations are even more accentuated in mobile environments (such as sensor networks) where computational and memory resources are typically limited. Introduced herein is a novel "resource adaptive" algorithm for spectrum and periodicity estimation on a continuous stream of data. The formulation is based on the derivation of a closed-form incremental computation of the spectrum, augmented by an intelligent load-shedding scheme that can adapt to available CPU resources. Experimentation indicates that the proposed technique can be a viable and resource efficient solution for real-time spectrum estimation.
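
One well-known closed-form incremental update for the spectrum of a sliding window is the sliding DFT, sketched below to illustrate what "incremental computation of the spectrum" can look like. Whether the application uses this exact recurrence is an assumption; the names `SlidingDFT` and `direct_dft` are hypothetical, and the load-shedding part of the abstract is not modelled. For a window of length N, when the oldest sample x_old leaves and x_new arrives, each bin updates as X_k' = (X_k - x_old + x_new) * exp(2*pi*i*k/N).

```python
import cmath
from collections import deque


def direct_dft(samples):
    """Reference O(N^2) DFT, used to initialise and to check the update."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(samples))
        for k in range(n)
    ]


class SlidingDFT:
    """Maintains the DFT of the most recent N samples in O(N) per sample."""

    def __init__(self, initial_window):
        self.n = len(initial_window)
        self.window = deque(initial_window)
        # Per-bin rotation factors exp(2*pi*i*k/N), computed once.
        self.twiddle = [
            cmath.exp(2j * cmath.pi * k / self.n) for k in range(self.n)
        ]
        self.bins = direct_dft(list(initial_window))

    def push(self, sample):
        """Slide the window by one sample and update every bin in place."""
        oldest = self.window.popleft()
        self.window.append(sample)
        delta = sample - oldest
        self.bins = [(x + delta) * w for x, w in zip(self.bins, self.twiddle)]
        return self.bins


sdft = SlidingDFT([1.0, 2.0, 3.0, 4.0])
updated = sdft.push(5.0)  # spectrum of the window [2.0, 3.0, 4.0, 5.0]
```

Each update costs O(N) rather than the O(N log N) of recomputing an FFT per sample. Floating-point error accumulates over many updates, so a long-running stream would need periodic re-initialisation from the raw window.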

66. Patent application

US20090060095A1 METHODS, APPARATUSES, AND COMPUTER PROGRAM PRODUCTS FOR CLASSIFYING UNCERTAIN DATA (lapsed)
Publication No.: US20090060095A1
Publication date: 2009-03-05
Application No.: US11846004
Filing date: 2007-08-28
Applicant: Charu Aggarwal, Philip S. Yu
Inventor: Charu Aggarwal, Philip S. Yu
IPC classification: H03D1/00, H04L27/06
CPC classification: G06K9/6226, G06K9/6259
Abstract: Uncertain data is classified by constructing an error adjusted probability density estimate for the data, and applying a subspace exploration process to the probability density estimate to classify the data.

67. Granted patent

US07475070B2 System and method for tree structure indexing that provides at least one constraint sequence to preserve query-equivalence between xml document structure match and subsequence match (lapsed)
Publication No.: US07475070B2
Publication date: 2009-01-06
Application No.: US11035889
Filing date: 2005-01-14
Applicant: Wei Fan, Haixun Wang, Philip S. Yu
Inventor: Wei Fan, Haixun Wang, Philip S. Yu
IPC classification: G06F17/30, G06F17/00
CPC classification: G06F17/30935, Y10S707/99933, Y10S707/99936
Abstract: Sequence-based XML indexing aims at avoiding expensive join operations in query processing. It transforms structured XML data into sequences so that a structured query can be answered holistically through subsequence matching. Herein, there is addressed the problem of query equivalence with respect to this transformation, and there is introduced a performance-oriented principle for sequencing tree structures. With query equivalence, XML queries can be performed through subsequence matching without join operations, post-processing, or other special handling for problems such as false alarms. There is identified a class of sequencing methods for this purpose, and there is presented a novel subsequence matching algorithm that observes query equivalence. Also introduced is a performance-oriented principle to guide the sequencing of tree structures. For any given XML dataset, the principle finds an optimal sequencing strategy according to its schema and its data distribution; there is thus presented herein a novel method that realizes this principle.

68. Granted patent

US07464068B2 System and method for continuous diagnosis of data streams (lapsed)
Publication No.: US07464068B2
Publication date: 2008-12-09
Application No.: US10880913
Filing date: 2004-06-30
Applicant: Wei Fan, Haixun Wang, Philip S. Yu
Inventor: Wei Fan, Haixun Wang, Philip S. Yu
IPC classification: G06F17/30
CPC classification: G06F17/30017, G06F2216/03, Y10S707/99931, Y10S707/99935
Abstract: In connection with the mining of time-evolving data streams, there is provided a general framework that mines changes and reconstructs models from a data stream with unlabeled instances or a limited number of labeled instances. In particular, there are defined herein statistical profiling methods that extend a classification tree in order to guess the percentage of drifts in the data stream without any labelled data. Exact error can be estimated by actively sampling a small number of true labels. If the estimated error is significantly higher than empirical expectations, there is preferably re-sampled a small number of true labels to reconstruct the decision tree from the leaf node level.

69. Patent application

US20080270640A1 METHOD AND APPARATUS FOR ADAPTIVE IN-OPERATOR LOAD SHEDDING (lapsed)
Publication No.: US20080270640A1
Publication date: 2008-10-30
Application No.: US12164671
Filing date: 2008-06-30
Applicant: BUGRA GEDIK, Kun-Lung Wu, Philip S. Yu
Inventor: BUGRA GEDIK, Kun-Lung Wu, Philip S. Yu
IPC classification: G06F3/00
CPC classification: H04L67/10, H02J3/14, H04L47/10, H04L47/225, H04L47/41, H04L49/90, H04L49/901, Y02D50/30, Y04S20/224
Abstract: One embodiment of the present method and apparatus for adaptive in-operator load shedding includes receiving at least two data streams (each comprising a plurality of tuples, or data items) into respective sliding windows of memory. A throttling fraction is then calculated based on input rates associated with the data streams and on currently available processing resources. Tuples are then selected for processing from the data streams in accordance with the throttling fraction, where the selected tuples represent a subset of all tuples contained within the sliding window.
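
The two steps named in this abstract (compute a throttling fraction from input rates and available resources, then process only a subset of each sliding window) can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the formula `capacity / total input rate`, the function names `throttling_fraction` and `select_tuples`, and the use of uniform random sampling to pick tuples are all hypothetical, and the application may select tuples by a different criterion.

```python
import math
import random


def throttling_fraction(input_rates, capacity):
    """Fraction of arriving tuples that can be processed.

    input_rates: tuples per second for each stream.
    capacity: tuples per second the operator can currently handle
              (assumed to be measured elsewhere).
    """
    total = sum(input_rates)
    if total <= 0:
        return 1.0  # nothing arriving, so nothing needs to be shed
    return min(1.0, capacity / total)


def select_tuples(window, fraction, rng):
    """Keep floor(fraction * len(window)) tuples, preserving window order.

    Uniform random selection is an assumption made for illustration.
    """
    keep = math.floor(fraction * len(window))
    kept_indices = sorted(rng.sample(range(len(window)), keep))
    return [window[i] for i in kept_indices]


rates = [300.0, 200.0]  # two streams, tuples per second
fraction = throttling_fraction(rates, capacity=250.0)
window = list(range(10))  # stand-in for one stream's sliding window
selected = select_tuples(window, fraction, random.Random(1))
```

Recomputing `fraction` periodically as rates and capacity change is what would make the shedding adaptive; here it is computed once for a single snapshot.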

70. Patent application

US20080205641A1 PRESERVING PRIVACY OF ONE-DIMENSIONAL DATA STREAMS USING DYNAMIC AUTOCORRELATION (lapsed)
Publication No.: US20080205641A1
Publication date: 2008-08-28
Application No.: US11678808
Filing date: 2007-02-26
Applicant: Yuan-Chi Chang, Feifei Li, Spyridon Papadimitriou, George A. Mihaila, Ioana Stanoi, Jimeng Sun, Philip S. Yu
Inventor: Yuan-Chi Chang, Feifei Li, Spyridon Papadimitriou, George A. Mihaila, Ioana Stanoi, Jimeng Sun, Philip S. Yu
IPC classification: H04L9/18
CPC classification: G06F21/755
Abstract: A method, information processing system, and computer readable medium are provided for preserving privacy of one-dimensional nonstationary data streams. The method includes receiving a one-dimensional nonstationary data stream. A set of first-moment statistical values is calculated, for a given instant of sub-space of time, for the data. The first moment statistical values include a principal component for the sub-space of time. The data is perturbed with noise along the principal component in proportion to the first-moment of statistical values so that at least part of a set of second-moment statistical values for the data is perturbed by the noise only within a predetermined variance.
