    • 1. Invention application
    • Partitioning of data mining training set
    • US20070214135A1
    • 2007-09-13
    • US11371477
    • 2006-03-09
    • Ioan Crivat, Raman Iyer, C. MacLennan
    • Ioan Crivat, Raman Iyer, C. MacLennan
    • G06F17/30
    • G06F17/30539
    • A system that effectuates fetching a complete set of relational data into a mining services server and subsequently defining desired partitions upon the fetched data is provided. In accordance with the innovation, the data can be locally cached and partitioned therefrom. Accordingly, upon the same mining structure (e.g., cache) that has been partitioned, the novel innovation can build mining models for each partition. In other words, the innovation can employ the concept of mining structure as a data cache while manipulating only partitions of this cache in certain operations. The innovation can be employed in scenarios where a user wants to train a mining model using only data points that satisfy a particular Boolean condition, a user wants to split the training set into multiple partitions (e.g., training/testing) and/or a user wants to perform a data mining procedure known as “N-fold cross validation.”
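The three scenarios named in the abstract (training on rows that satisfy a Boolean condition, a training/testing split, and N-fold cross validation) can be illustrated with a minimal sketch. It is not the patented implementation: the class `CachedMiningStructure`, its method names, and the sample rows are all hypothetical, and the only idea taken from the abstract is that the data is fetched once into a cache and each partition is then defined as a set of row indices over that cache.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]


class CachedMiningStructure:
    """Hypothetical stand-in for a mining structure used as a local data cache."""

    def __init__(self, rows: Sequence[Row]) -> None:
        # Fetch once: the complete data set is copied into the cache.
        self._rows: List[Row] = list(rows)

    def filter_indices(self, predicate: Callable[[Row], bool]) -> List[int]:
        """Partition defined by a Boolean condition over the cached rows."""
        return [i for i, row in enumerate(self._rows) if predicate(row)]

    def split_indices(
        self, train_fraction: float, seed: int = 0
    ) -> Tuple[List[int], List[int]]:
        """Shuffle row indices and split them into training/testing partitions."""
        indices = list(range(len(self._rows)))
        random.Random(seed).shuffle(indices)
        cut = int(len(indices) * train_fraction)
        return indices[:cut], indices[cut:]

    def fold_indices(self, n_folds: int) -> List[Tuple[List[int], List[int]]]:
        """Return one (train, test) index pair per fold for N-fold cross validation."""
        indices = list(range(len(self._rows)))
        # Assign rows to folds round-robin so fold sizes differ by at most one.
        folds = [indices[k::n_folds] for k in range(n_folds)]
        pairs = []
        for k, test in enumerate(folds):
            train = [i for j, fold in enumerate(folds) if j != k for i in fold]
            pairs.append((train, test))
        return pairs

    def rows(self, indices: Sequence[int]) -> List[Row]:
        """Materialize a partition; the cache itself is never modified."""
        return [self._rows[i] for i in indices]


# Illustrative data only: ten rows with ages 20..29.
cache = CachedMiningStructure([{"id": i, "age": 20 + i} for i in range(10)])

over_25 = cache.filter_indices(lambda r: r["age"] > 25)  # Boolean-condition partition
train, test = cache.split_indices(0.7)                   # training/testing partitions
folds = cache.fold_indices(5)                            # 5-fold cross validation
```

Because every partition is just a list of indices into the same cache, a separate model could be trained per partition without fetching the relational data again, which is the point the abstract makes about treating the mining structure as a cache.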