Abstract

In classification problems, cross-validation draws random subsets of the dataset in order to improve the model's ability to assign new observations to the correct class. Research articles from various fields show that, when applied to regression problems, the bootstrap can improve either the predictive ability of the model or its feature selection. The purpose of our research is to show that the bootstrap, used as a model selection procedure in classification problems, can outperform cross-validation. We compare the performance measures of cross-validation and the bootstrap on a set of classification problems and analyse their practical advantages and disadvantages. We show that the bootstrap procedure can reduce execution time compared with the cross-validation procedure while preserving the accuracy of the classification model. This advantage of the bootstrap is particularly important for large datasets, as the time needed to fit the model can be reduced without decreasing the model's performance.
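To make the two procedures concrete, the following is a minimal sketch of how each one estimates classification accuracy. It is not the authors' experimental setup: the abstract names neither the bootstrap variant, the classifier, nor the datasets, so the out-of-bag bootstrap, the nearest-centroid classifier, the synthetic two-class data, and all function names below are assumptions made for illustration only.

```python
import random

def fit_centroids(X, y):
    """Nearest-centroid classifier (illustrative choice): one mean vector per class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {c: [sum(col) / len(col) for col in zip(*rows)]
            for c, rows in groups.items()}

def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def accuracy(centroids, X, y):
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(X)

def kfold_accuracy(X, y, k, rng):
    """k-fold cross-validation: every observation is held out exactly once."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / len(scores)

def bootstrap_accuracy(X, y, n_rounds, rng):
    """Out-of-bag bootstrap (assumed variant): train on a sample drawn with
    replacement, evaluate on the observations that were never drawn."""
    n = len(X)
    scores = []
    for _ in range(n_rounds):
        train = [rng.randrange(n) for _ in range(n)]
        drawn = set(train)
        oob = [i for i in range(n) if i not in drawn]
        if not oob:  # extremely unlikely, but skip a round with no held-out data
            continue
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        scores.append(accuracy(model, [X[i] for i in oob], [y[i] for i in oob]))
    return sum(scores) / len(scores)

# Synthetic two-class data (an assumption; the paper's datasets are not given here).
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)] \
  + [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

cv_acc = kfold_accuracy(X, y, k=10, rng=rng)
boot_acc = bootstrap_accuracy(X, y, n_rounds=10, rng=rng)
print(f"10-fold CV accuracy:       {cv_acc:.3f}")
print(f"OOB bootstrap accuracy:    {boot_acc:.3f}")
```

In this sketch the number of model fits equals `k` for cross-validation and `n_rounds` for the bootstrap, so any difference in running time comes from how many fits are requested and how large each training set is. The sketch does not reproduce or verify the timing results reported in the abstract.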
