A Machine Learning Classifier Based on Statistical Method for Small Number of Samples

Lin XU, Yan LI

Abstract


Because few training samples are available, it is difficult to obtain a classifier that generalizes well from a small sample. Yet the problem of building classifiers from small samples is widespread in practice, especially in biomedicine, and it has therefore become an active research topic. In this study, we propose a machine learning classifier based on statistical methods to address this problem. First, the bootstrap, the chi-square test, and other statistical methods are combined to evaluate the performance of multiple machine learning classifiers on small-sample datasets. Then, the use of the Youden index in machine learning classifiers is optimized to meet the requirements of clinical application. Simulation experiments on the UCI breast cancer dataset show that the method yields a more stable and accurate performance evaluation, and that the optimized Youden index allows the classifier to flexibly meet the application requirements of small-sample medical research.
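
To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' implementation. It computes the standard Youden index, J = sensitivity + specificity - 1, picks the score threshold that maximizes it, and uses a percentile bootstrap to gauge how stable J is on a small sample. The function names, the toy data, and the choice of a 95% percentile interval are all assumptions made for illustration; the abstract does not specify how the Youden index is optimized or how the chi-square test enters the evaluation, so neither is reproduced here.

```python
import random


def youden_index(labels, scores, threshold):
    """Standard Youden index J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    Assumes labels contain at least one 1 and at least one 0.
    """
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def best_threshold(labels, scores):
    """Return the observed score that maximizes J when used as threshold."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: youden_index(labels, scores, t))


def bootstrap_youden(labels, scores, threshold, n_resamples=1000, seed=0):
    """Percentile bootstrap interval (2.5%, 97.5%) for J at a fixed threshold.

    Resamples that contain only one class are redrawn, because J is
    undefined without both positives and negatives.
    """
    rng = random.Random(seed)
    n = len(labels)
    values = []
    while len(values) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            ss = [scores[i] for i in idx]
            values.append(youden_index(ys, ss, threshold))
    values.sort()
    return values[int(0.025 * n_resamples)], values[int(0.975 * n_resamples)]


# Hypothetical toy data: 5 negatives and 6 positives with classifier scores.
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.65, 0.7, 0.8, 0.9, 0.95]

threshold = best_threshold(labels, scores)
j_value = youden_index(labels, scores, threshold)
low, high = bootstrap_youden(labels, scores, threshold)
print(f"threshold={threshold}, J={j_value:.3f}, 95% interval=({low:.3f}, {high:.3f})")
```

On a sample this small the bootstrap interval for J is typically wide, which illustrates why a single point estimate of classifier performance can be misleading for small datasets.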


DOI
10.12783/dtcse/iccis2019/31997
