    • 1. Granted invention patent
    • System load based adaptive prefetch
    • US07359890B1
    • 2008-04-15
    • US10142257
    • 2002-05-08
    • Chi Ku; Arvind Nithrakashyap; Ari W. Mozes
    • G06F17/30
    • G06F17/3048; Y10S707/99932
    • The number of data blocks to prefetch into a buffer cache is determined dynamically at run time (e.g. during execution of a query), based at least in part on the load placed on the buffer cache. An application program (such as a database) uses this number (also called the “prefetch size”) to determine the amount of prefetching. A sequence of instructions (also called the “prefetch size daemon”) computes the prefetch size based on, for example, the number of prefetched blocks aged out before use. The prefetch size daemon dynamically revises the prefetch size based on usage of the buffer cache, thereby forming a feedback loop. Depending on the embodiment, at times of excessive use of the buffer cache, prefetching may even be turned off. Although in one embodiment described herein the prefetch size daemon is implemented in a database, in other embodiments other kinds of applications and/or the operating system itself can use a prefetch size daemon of the type described herein to dynamically determine and change prefetch behavior.
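The feedback loop in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the class name, the halving/increment policy, the 10% waste threshold, and the size bounds are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class PrefetchSizeDaemon:
    """Hypothetical controller that revises the prefetch size each interval."""
    prefetch_size: int = 8         # blocks to prefetch per request (assumed start)
    max_size: int = 64             # assumed upper bound
    waste_threshold: float = 0.10  # assumed tolerable share of wasted prefetches

    def revise(self, prefetched: int, aged_out_unused: int) -> int:
        """Update the prefetch size from one interval's buffer cache statistics.

        prefetched: blocks prefetched during the interval.
        aged_out_unused: prefetched blocks evicted before anything read them.
        """
        if prefetched == 0:
            # No evidence this interval (e.g. prefetching was off): probe with
            # at least one block so the loop can recover once load drops.
            self.prefetch_size = min(self.max_size, max(self.prefetch_size, 1))
            return self.prefetch_size
        wasted = aged_out_unused / prefetched
        if wasted > self.waste_threshold:
            # Cache is under pressure: back off. Reaching 0 turns prefetch off.
            self.prefetch_size //= 2
        else:
            # Prefetched blocks are being used: grow cautiously.
            self.prefetch_size = min(self.max_size, self.prefetch_size + 1)
        return self.prefetch_size
```

A caller would feed the daemon the counters for each sampling interval and pass the returned size to whatever component issues the reads.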
    • 2. Granted invention patent
    • Binning predictors using per-predictor trees and MDL pruning
    • US08280915B2
    • 2012-10-02
    • US11344185
    • 2006-02-01
    • Mahesh Jagannath; Chitra Bhagwat; Joseph Yarmus; Ari W. Mozes
    • G06F7/00; G06F17/30
    • G06K9/6282
    • Binning of predictor values used for generating a data mining model provides a useful reduction in memory footprint and computation during the computationally dominant decision tree build phase, while reducing the model's information loss and the introduction of false information artifacts. The data is stored in a database table in a database system, and the data mining modeling has selected at least one predictor and one target for the data, which includes a plurality of values of the predictor and a plurality of values of the target. The method comprises constructing a binary tree for the predictor that splits the values of the predictor into a plurality of portions, pruning the binary tree, and defining as bins of the predictor the leaves of the tree that remain after pruning, each leaf representing a portion of the values of the predictor.
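One way to picture the build, prune, and bin steps for a single numeric predictor is sketched below. The cost function is a simplified MDL-style stand-in chosen for illustration (bits to encode the target labels plus an assumed fixed charge of log2(n) bits per split); it is not taken from the patent, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(labels):
    """Bits needed to encode the labels with an ideal code: n * H(labels)."""
    n = len(labels)
    return -sum(c * math.log2(c / n) for c in Counter(labels).values())


def _build_and_prune(pairs, split_cost):
    """Grow a binary tree on sorted (value, target) pairs, pruning bottom-up.

    Returns (description_length_in_bits, cut_points_that_survive_pruning).
    """
    labels = [t for _, t in pairs]
    leaf_cost = entropy_bits(labels)
    best_i, best_cost = None, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between equal predictor values
        cost = entropy_bits(labels[:i]) + entropy_bits(labels[i:])
        if best_cost is None or cost < best_cost:
            best_i, best_cost = i, cost
    if best_i is None or leaf_cost == 0:
        return leaf_cost, []  # nothing to split, or already pure
    left_cost, left_cuts = _build_and_prune(pairs[:best_i], split_cost)
    right_cost, right_cuts = _build_and_prune(pairs[best_i:], split_cost)
    subtree_cost = split_cost + left_cost + right_cost
    if subtree_cost >= leaf_cost:
        return leaf_cost, []  # prune: the split does not pay for itself
    cut = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    return subtree_cost, left_cuts + [cut] + right_cuts


def bin_predictor(pairs):
    """Return sorted bin boundaries for one predictor; leaves become bins."""
    pairs = sorted(pairs, key=lambda p: p[0])
    split_cost = math.log2(max(len(pairs), 2))  # assumed per-split charge
    _, cuts = _build_and_prune(pairs, split_cost)
    return cuts
```

With this cost, a predictor whose values separate the target cleanly keeps its cut point, while a predictor whose targets alternate from row to row collapses to a single bin.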
    • 5. Granted invention patent
    • Database index validation mechanism
    • US07272589B1
    • 2007-09-18
    • US09703909
    • 2000-11-01
    • Todd P. Guay; Gregory S. Smith; Ari W. Mozes; Gaylen D. Royal
    • G06F17/30
    • G06F17/30312; Y10S707/954; Y10S707/99932; Y10S707/99933; Y10S707/99934; Y10S707/99935
    • A method evaluates a plurality of candidate index sets for a workload of database statements in a database system by first generating baseline statistics for each statement in the workload. An index superset is formed by combining an existing or current index set and a proposed index set. A candidate index set is derived from the index superset, the candidate index being one of the plurality of candidate index sets. Statistics for a statement are generated by first creating an execution plan which represents an efficient series of steps for executing the statement given the candidate index set. The execution plan is evaluated, and statistics based on the evaluation of the execution plan are generated and recorded. The cost of the execution plan is then determined and statistics are generated. Statistics for each candidate index set are rolled up and presented to a user or an index tuning mechanism.
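The evaluation flow described above (baseline, index superset, candidate sets, per-statement plan cost, roll-up) might look like the sketch below. The cost model is a toy stand-in for a real optimizer, enumerating every subset of the superset is an assumption made for brevity, and all names are hypothetical.

```python
from itertools import combinations


def plan_cost(statement, index_set):
    """Toy optimizer stand-in: an indexed lookup costs 10, a full scan 100."""
    return 10 if statement["filter_column"] in index_set else 100


def evaluate_candidates(workload, existing, proposed):
    """Roll up workload cost for every candidate index set.

    Returns (baseline_cost, report), where report maps each candidate
    (a sorted tuple of indexed columns) to its total cost and its change
    relative to the baseline under the existing indexes.
    """
    baseline = sum(plan_cost(s, set(existing)) for s in workload)
    superset = sorted(set(existing) | set(proposed))
    report = {}
    for size in range(len(superset) + 1):
        for candidate in combinations(superset, size):
            total = sum(plan_cost(s, set(candidate)) for s in workload)
            report[candidate] = {"total_cost": total, "delta": total - baseline}
    return baseline, report
```

The rolled-up report is what would be shown to a user or handed to an index tuning step; a negative `delta` marks a candidate set that is cheaper than the current indexes under this toy model.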
    • 6. Granted invention patent
    • Method and system for histogram determination in a database
    • US06691099B1
    • 2004-02-10
    • US09872588
    • 2001-05-31
    • Ari W. Mozes
    • G06F17/30
    • G06F17/30469; G06F17/30306; G06F17/30536; Y10S707/99932; Y10S707/99934
    • A method and system for determining when to collect, save, and/or utilize histograms is disclosed. A mechanism for automatically deciding when to collect histograms upon request from the user is provided. The histogram collection decision is based on the columns the user is interested in, the role these columns play in the queries as submitted to the system, and the underlying distribution for these columns, e.g., as seen in a random sample. The user specifies which columns are of interest, and the database is configured to collect column usage information that describes how each column is being used in the workload. This column usage information could be stored in memory and periodically flushed to disk. Given a set of potential columns, the distribution of those columns is viewed in combination with the usage information to determine which columns should have histograms.
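The decision described above combines three inputs: the columns the user cares about, recorded column usage, and the distribution seen in a sample. A minimal sketch follows, assuming a simple skew test (most frequent value compared with the share it would have under a uniform distribution) and a hypothetical `in_predicates` usage counter; neither detail comes from the patent.

```python
from collections import Counter


def is_skewed(sample, skew_ratio=1.5):
    """Assumed skew test: the top value is much more common than uniform."""
    counts = Counter(sample)
    if len(counts) < 2:
        return False  # empty or single-valued: a histogram adds nothing
    uniform_share = len(sample) / len(counts)
    return max(counts.values()) > skew_ratio * uniform_share


def choose_histogram_columns(columns_of_interest, usage, samples):
    """Pick columns that are both used in query predicates and skewed.

    usage: column -> {"in_predicates": count}, hypothetical usage statistics.
    samples: column -> list of sampled values.
    """
    chosen = []
    for col in columns_of_interest:
        used = usage.get(col, {}).get("in_predicates", 0) > 0
        if used and is_skewed(samples.get(col, [])):
            chosen.append(col)
    return chosen
```

A column that is heavily queried but uniformly distributed (such as a unique key) is skipped, as is a skewed column that never appears in a predicate.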
    • 7. Granted invention patent
    • System and method for building decision tree classifiers using bitmap techniques
    • US07571159B2
    • 2009-08-04
    • US11344193
    • 2006-02-01
    • Shiby Thomas; Wei Li; Joseph Yarmus; Mahesh Jagannath; Ari W. Mozes
    • G06F7/00; G06F17/30; G06F17/00
    • G06F17/30545; Y10S707/99933; Y10S707/99945
    • A method, system, and computer program product for counting predictor-target pairs for a decision tree model generate count tables more quickly and efficiently than previous techniques. The decision tree model is based on data stored in a database, the data comprising a plurality of rows, at least one predictor, and at least one target. The method comprises generating a bitmap for each split node by intersecting a parent node bitmap with a bitmap of a predictor that satisfies a condition of the node, intersecting each split node bitmap with each predictor bitmap and with each target bitmap to form intersected bitmaps, and counting the bits of each intersected bitmap to generate a count of predictor-target pairs.
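The bitmap intersections described above can be illustrated with Python integers used as bitmaps, one bit per row. This is a small sketch of the counting idea only; the helper names and the sample table are made up for the example.

```python
def bitmap(rows, predicate):
    """Build an integer bitmap with bit i set when rows[i] satisfies predicate."""
    bits = 0
    for i, row in enumerate(rows):
        if predicate(row):
            bits |= 1 << i
    return bits


def count_pairs(node_bits, predictor_bitmaps, target_bitmaps):
    """Count rows in a node for every (predictor value, target value) pair.

    Each count is the number of set bits in node AND predictor AND target.
    """
    return {
        (p, t): bin(node_bits & p_bits & t_bits).count("1")
        for p, p_bits in predictor_bitmaps.items()
        for t, t_bits in target_bitmaps.items()
    }


rows = [
    {"age": "young", "income": "low", "buy": "no"},
    {"age": "young", "income": "high", "buy": "yes"},
    {"age": "old", "income": "high", "buy": "yes"},
    {"age": "young", "income": "high", "buy": "no"},
]
root = (1 << len(rows)) - 1  # every row belongs to the root node
# Split node bitmap = parent bitmap AND bitmap of the node's condition.
young_node = root & bitmap(rows, lambda r: r["age"] == "young")
income = {v: bitmap(rows, lambda r, v=v: r["income"] == v) for v in ("low", "high")}
target = {v: bitmap(rows, lambda r, v=v: r["buy"] == v) for v in ("yes", "no")}
counts = count_pairs(young_node, income, target)
```

Because each count is a bitwise AND followed by a bit count, no row needs to be revisited once the per-value bitmaps exist.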
    • 8. Granted invention patent
    • Frequent itemset counting using subsets of bitmaps
    • US07756853B2
    • 2010-07-13
    • US10927893
    • 2004-08-27
    • Wei Li; Ari W. Mozes; Hakan Jakobsson
    • G06F7/00
    • G06F17/30539
    • A method and mechanism for performing improved frequent itemset operations is provided. A set of item groups is divided into a plurality of subsets. Each item group is composed of a set of data items. Possible combinations of data items that may frequently appear together in the same item group are referred to as candidate combinations. Candidate combinations comprising a first set of data items are identified, and thereafter the occurrence of each candidate combination in any item group in each subset is counted by comparing item bitmaps, associated with items in the candidate combination, in each subset in turn. The comparison of item bitmaps is performed in volatile memory. A total frequent itemset count that describes the frequency of candidate combinations in item groups across all subsets is obtained. Thereafter, the total frequent itemset count for candidate combinations having a larger number of data items may be determined.
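The per-subset counting described above can be sketched as follows, with Python integers standing in for item bitmaps and a fixed partition size standing in for whatever fits in memory. Function names and the partitioning rule are assumptions made for the example.

```python
def partition_bitmaps(groups, part_size):
    """Split item groups into subsets and build one bitmap per item per subset.

    In each subset's dict, bit i of bitmaps[item] is set when the i-th
    group of that subset contains the item.
    """
    parts = []
    for start in range(0, len(groups), part_size):
        bitmaps = {}
        for pos, group in enumerate(groups[start:start + part_size]):
            for item in group:
                bitmaps[item] = bitmaps.get(item, 0) | (1 << pos)
        parts.append(bitmaps)
    return parts


def count_candidates(parts, candidates):
    """Total occurrences of each non-empty candidate across all subsets."""
    totals = {cand: 0 for cand in candidates}
    for bitmaps in parts:  # one subset at a time, so only its bitmaps are live
        for cand in candidates:
            bits = -1  # all ones; narrowed by each item's bitmap
            for item in cand:
                bits &= bitmaps.get(item, 0)
            totals[cand] += bin(bits).count("1")
    return totals
```

Candidates that reach a chosen support threshold at one size would then seed the candidates of the next larger size, which can be counted with the same two functions.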