    • 2. Granted invention patent
    • Method and architecture for automated optimization of ETL throughput in data warehousing applications
    • US06208990B1
    • 2001-03-27
    • US09116426
    • 1998-07-15
    • Sankaran Suresh; Jyotindra Pramathnath Gautam; Girish Pancha; Frank Joseph DeRose; Mohan Sankaran
    • G06F17/30
    • G06F17/30563; Y10S707/99936; Y10S707/99943; Y10S707/99945
    • A computer software architecture to automatically optimize the throughput of the data extraction/transformation/loading (ETL) process in data warehousing applications. This architecture has a componentized aspect and a pipeline-based aspect. The componentized aspect means that every transformation used in this architecture is built from transformation components selected from an extensible set. Besides simplifying source code maintenance and adjustment for data warehouse users, these transformation components give those users the building blocks to construct pertinent and functionally sophisticated transformations in a pipelined manner. Within a pipeline, each transformation component automatically stages or streams its data to optimize ETL throughput. Furthermore, each transformation component either pushes data to another transformation component, pulls data from another transformation component, or performs a push/pull operation on the data. The pipelining, staging/streaming, and pushing/pulling features of the transformation components thereby optimize the throughput of the ETL process.
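The pipeline idea in the abstract above can be sketched roughly as follows. This is only an illustration, not the patented architecture: Python generators stand in for transformation components, and the names `source`, `filter_rows`, `sort_rows`, `load`, and `run_pipeline` are hypothetical. Streaming is modeled as yielding each row as it arrives, staging as buffering all input before emitting. The sketch fixes the stage-or-stream choice per component type, whereas the abstract describes that choice as automatic.

```python
from typing import Callable, Iterable, Iterator, List

Row = dict


def source(rows: Iterable[Row]) -> Iterator[Row]:
    """Source component: downstream components pull rows from it."""
    for row in rows:
        yield row


def filter_rows(rows: Iterable[Row], keep: Callable[[Row], bool]) -> Iterator[Row]:
    """Streaming component: passes each row on as soon as it arrives."""
    for row in rows:
        if keep(row):
            yield row


def sort_rows(rows: Iterable[Row], key: str) -> Iterator[Row]:
    """Staging component: buffers all of its input before emitting anything."""
    staged = sorted(rows, key=lambda r: r[key])
    yield from staged


def load(rows: Iterable[Row], target: List[Row]) -> int:
    """Sink component: pulls rows through the pipeline and pushes them to the target."""
    count = 0
    for row in rows:
        target.append(row)
        count += 1
    return count


def run_pipeline(raw: List[Row]) -> List[Row]:
    """Chain the components: extract, filter (stream), sort (stage), load."""
    target: List[Row] = []
    stage = source(raw)
    stage = filter_rows(stage, lambda r: r["amount"] > 0)
    stage = sort_rows(stage, "amount")
    load(stage, target)
    return target
```

Because each component only consumes an iterable and produces an iterator, a new transformation can be added to the chain without touching the others, which is the maintenance benefit the abstract attributes to componentization.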
    • 3. Granted invention patent
    • Method for incremental aggregation of dynamically increasing database data sets
    • US5794246A
    • 1998-08-11
    • US846934
    • 1997-04-30
    • Mohan Sankaran; Sankaran Suresh; Mon Wong; Diaz Nesamoney
    • G06F17/30; G06F15/00
    • G06F17/30412; G06F17/30592; Y10S707/99933; Y10S707/99934; Y10S707/99937; Y10S707/99942
    • A method of performing incremental aggregation of dynamically increasing database data sets. An embodiment of the present invention operates within a data mart or data warehouse to aggregate data stored within an operational database corresponding to newly received data to provide current information. Initially, a computer server creates an intermediate file, which the server initializes with an aggregate data set. The aggregate data set consists of data values and count values that each correspond to specific group identifiers. The computer determines whether any group identifiers within a new set of input data are identical to any group identifiers stored within the intermediate file. If an inputted group identifier matches a stored group identifier, the inputted data value is aggregated with the stored data value and the count value corresponding to that stored group identifier is incremented by one. If an inputted group identifier does not match any of the stored group identifiers, the inputted group identifier and corresponding data value are stored within the intermediate file and a count value of one is appended to that group identifier. Once all the group identifiers within the new set of input data have been determined, the computer stores all the changes that were made to the intermediate file into the aggregate data set.
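The steps in the abstract above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented method: the intermediate file is modeled as an in-memory dictionary, the aggregation is assumed to be a running sum, and the names `Aggregate` and `incremental_aggregate` are hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Maps a group identifier to (aggregated data value, count value).
Aggregate = Dict[str, Tuple[float, int]]


def incremental_aggregate(
    aggregate: Aggregate, new_rows: Iterable[Tuple[str, float]]
) -> Aggregate:
    """Fold new (group identifier, data value) rows into an existing aggregate."""
    # Assumption: the "intermediate file" is an in-memory copy
    # initialized from the current aggregate data set.
    intermediate = dict(aggregate)

    for group_id, value in new_rows:
        if group_id in intermediate:
            # Matching group: aggregate the value (sum assumed) and
            # increment the count by one.
            total, count = intermediate[group_id]
            intermediate[group_id] = (total + value, count + 1)
        else:
            # New group: store the value with a count of one.
            intermediate[group_id] = (value, 1)

    # Store the changes made to the intermediate copy back into the aggregate.
    aggregate.update(intermediate)
    return aggregate
```

Only the newly received rows are scanned, so the cost of refreshing the aggregate depends on the size of the new input rather than on the full history. Keeping the count next to the sum also allows an average to be derived later without revisiting old rows.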
    • 4. Granted invention patent
    • Apparatus and method for capturing and propagating changes from an operational database to data marts
    • US06032158A
    • 2000-02-29
    • US850490
    • 1997-05-02
    • Pinaki Mukhopadhyay; Diaz Nesamoney; Mohan Sankaran; Sankaran Suresh; Sanjeev K. Gupta
    • G06F17/30
    • G06F17/30563; G06F17/30345; G06F17/30371; G06F17/30592; Y10S707/99952
    • A method for updating a target table of a data mart in response to changes made by a transaction to data stored in a source table of an operational database. Data that was changed in the source table by the transaction is stored in a dynamic image table of a change capture database. Data that was not changed in the source table by the transaction, but that must nevertheless be mapped to the target table, is stored in a static image table of the change capture database. The change capture database also contains relevant information regarding the transaction. Once the dynamic and static image tables are properly staged, the changes are propagated from the change capture database to the target tables of the data marts. In other words, data is extracted from the change capture database and subsequently transformed and loaded, thereby minimizing the impact on the operational database. Thereupon, the tables of the change capture database are truncated to discard data that is no longer needed.
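A rough Python sketch of the capture, propagate, and truncate flow described in the abstract above. Everything here is illustrative and assumed rather than taken from the patent: tables are plain lists of dictionaries, the order and customer columns are invented for the example, and `ChangeCaptureDB`, `capture`, and `propagate` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ChangeCaptureDB:
    dynamic_image: List[dict] = field(default_factory=list)  # rows changed by the transaction
    static_image: List[dict] = field(default_factory=list)   # unchanged rows still needed for mapping
    transactions: List[dict] = field(default_factory=list)   # information about each transaction

    def truncate(self) -> None:
        """Discard staged data that is no longer needed (all tables, by assumption)."""
        self.dynamic_image.clear()
        self.static_image.clear()
        self.transactions.clear()


def capture(db: ChangeCaptureDB, txn_id: int,
            changed: List[dict], unchanged_needed: List[dict]) -> None:
    """Stage one transaction's changed rows and the unchanged rows they depend on."""
    db.transactions.append({"txn_id": txn_id, "rows": len(changed)})
    db.dynamic_image.extend(changed)
    db.static_image.extend(unchanged_needed)


def propagate(db: ChangeCaptureDB, target: Dict[int, dict]) -> None:
    """Extract from the change capture database, transform, load, then truncate."""
    # The lookup reads the static image, not the operational database.
    customers = {row["customer_id"]: row["name"] for row in db.static_image}
    for order in db.dynamic_image:
        target[order["order_id"]] = {
            "customer": customers[order["customer_id"]],
            "amount": order["amount"],
        }
    db.truncate()
```

In this sketch `propagate` never touches the operational source tables, which mirrors the abstract's point that extraction runs against the change capture database to limit the load on the operational system.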