
Basic information:
- Patent title: MACHINE LEARNING SYSTEM AND APPARATUS FOR SAMPLING LABELLED DATA
- Application number: US16423315; Filing date: 2019-05-28
- Publication number: US20200175418A1; Publication date: 2020-06-04
- Inventors: Vincent PHAM, Reza FARIVAR, Jeremy GOODSITT, Fardin Abdi Taghi ABAD, Anh TRUONG, Mark WATSON, Austin WALTERS
- Applicant: Capital One Services, LLC
- Main classification: G06N20/00
- IPC classification: G06N20/00; G06F16/2457; G06N5/04
Abstract:
A database including various datasets and metadata associated with each respective dataset is provided. These datasets were used to train predictive models. The database stores a performance value associated with the model trained with each dataset. When provided with a new dataset, a server can determine various metadata for the new dataset. Using the metadata, the server can search the database and retrieve datasets which have similar metadata values. The server can narrow the search based on the performance value associated with the dataset. Based on the retrieved datasets, the server can recommend at least one sampling technique. The sampling technique can be determined based on the one or more sampling techniques that were used in association with the retrieved datasets.
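
The retrieval flow in the abstract (compute metadata, find stored datasets with similar metadata, narrow by performance, recommend a sampling technique) can be illustrated with a minimal Python sketch. Nothing below comes from the patent itself: the `DatasetRecord` fields, the Euclidean distance over shared numeric metadata, the `max_distance` and `min_performance` thresholds, and the majority vote are all assumptions chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Optional


@dataclass
class DatasetRecord:
    """Hypothetical database row: one stored dataset and its outcome."""
    name: str
    metadata: Dict[str, float]   # assumed numeric metadata, e.g. class imbalance
    sampling_technique: str      # technique used when the model was trained
    performance: float           # assumed score, higher is better


def metadata_distance(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Euclidean distance over the metadata keys both datasets share."""
    shared = a.keys() & b.keys()
    if not shared:
        return float("inf")      # nothing comparable, so treat as dissimilar
    return sqrt(sum((a[k] - b[k]) ** 2 for k in shared))


def recommend_sampling(
    new_metadata: Dict[str, float],
    database: List[DatasetRecord],
    max_distance: float = 0.2,
    min_performance: float = 0.8,
) -> Optional[str]:
    """Return the most common technique among similar, well-performing datasets."""
    # Step 1: retrieve datasets whose metadata is close to the new dataset's.
    similar = [
        r for r in database
        if metadata_distance(new_metadata, r.metadata) <= max_distance
    ]
    # Step 2: narrow the result by the stored performance value.
    good = [r for r in similar if r.performance >= min_performance]
    if not good:
        return None              # no basis for a recommendation
    # Step 3: recommend the technique used most often by the survivors.
    votes = Counter(r.sampling_technique for r in good)
    return votes.most_common(1)[0][0]
```

Calling `recommend_sampling(new_metadata, database)` returns a technique name, or `None` when no stored dataset is both similar enough and well-performing enough. A majority vote is only one way to combine the retrieved techniques; the abstract leaves open how the recommendation is derived from them.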