    • 27. Invention application
    • GENERATING DATA FROM IMBALANCED TRAINING DATA SETS
    • US20150088791A1
    • 2015-03-26
    • US14034797
    • 2013-09-24
    • International Business Machines Corporation
    • Ching-Yung Lin, Wan-Yi Lin, Yinglong Xia
    • G06N99/00
    • G06N99/005, G06F17/50
    • Injecting generated data samples into a minority data class of an imbalanced training data set is provided. In response to receiving an input to balance the imbalanced training data set that includes a majority data class and the minority data class, a set of data samples is generated for the minority data class. A distance is calculated from each data sample in the set of generated data samples to a center of a kernel that includes a set of data samples of the majority data class. Each data sample in the set of generated data samples is stored within a corresponding distance score bucket based on the calculated distance of a data sample. Generated data samples are selected from a number of highest ranking distance score buckets. The generated data samples selected from the number of highest ranking distance score buckets are injected into the minority data class.
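The steps in the abstract (generate minority samples, measure their distance to the center of a majority-class kernel, bucket them by distance, pick from the highest-ranking buckets, inject) can be sketched roughly as follows. This is only an illustration under several assumptions that the abstract does not settle: the kernel center is approximated by the plain centroid of the majority samples, candidates are produced by adding Gaussian noise to existing minority samples, buckets are equal-width distance ranges, and "highest ranking" is read as "farthest from the majority center". All function and parameter names are hypothetical.

```python
import math
import random


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def generate_candidates(minority, n, noise=0.5, rng=None):
    """Assumption: candidates are noisy copies of randomly chosen minority samples."""
    rng = rng or random.Random(0)
    candidates = []
    for _ in range(n):
        base = rng.choice(minority)
        candidates.append(tuple(x + rng.gauss(0.0, noise) for x in base))
    return candidates


def bucket_by_distance(candidates, center, n_buckets):
    """Store each candidate in an equal-width distance bucket.

    Bucket 0 holds the candidates nearest the center, the last bucket the farthest.
    """
    dists = [math.dist(c, center) for c in candidates]
    lo, hi = min(dists), max(dists)
    width = (hi - lo) / n_buckets or 1.0  # avoid zero width if all distances match
    buckets = [[] for _ in range(n_buckets)]
    for cand, d in zip(candidates, dists):
        idx = min(int((d - lo) / width), n_buckets - 1)
        buckets[idx].append(cand)
    return buckets


def balance(majority, minority, n_candidates=200, n_buckets=10, top_k=3, rng=None):
    """Return the minority class with selected generated samples appended."""
    center = centroid(majority)  # assumption: kernel center = majority centroid
    candidates = generate_candidates(minority, n_candidates, rng=rng)
    buckets = bucket_by_distance(candidates, center, n_buckets)
    # Assumption: "highest ranking" buckets are those farthest from the center.
    selected = [c for bucket in reversed(buckets[-top_k:]) for c in bucket]
    needed = max(len(majority) - len(minority), 0)
    return minority + selected[:needed]
```

A call such as `balance(majority, minority, rng=random.Random(1))` returns the original minority samples followed by at most `len(majority) - len(minority)` generated ones. If the intended ranking favors samples near the majority boundary instead, the selection line would take `buckets[:top_k]` rather than the last `top_k` buckets.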