Abstract

This paper deals with learning from unlabeled or noisy-labeled data in the context of a classification problem. Because the outcome of a classification problem takes one of a discrete set of values, assumptions on the expected outcomes can be established to obtain the most likely prediction model at the training stage. In this paper, a novel case-based model selection method is proposed that combines hypothesis testing over a discrete set of expected outcomes with feature extraction within a cross-validated classification stage. This wrapper-type procedure acts on fully observable variables under hypothesis testing and improves classification accuracy on the test set, or at least keeps performance at the level of the statistical classifier. The model selection strategy in the cross-validation loop allows building an ensemble classifier that can improve the performance of any expert and intelligent system, particularly on small-sample-size datasets. Experiments were carried out on several databases, yielding a clear improvement over the baseline; e.g., on the SPECT dataset, Acc = 86.35 ± 1.51, with Sen = 91.10 ± 2.77 and Spe = 81.11 ± 1.61. In addition, the cross-validation (CV) error estimate for the classifier under our approach was found to be an almost unbiased estimate (as with the baseline approach) of the true error that the classifier would incur on independent data.
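To make the general structure concrete, the sketch below shows a wrapper-type model selection step inside a cross-validation loop that accumulates the per-fold winners into a majority-vote ensemble, and reports the per-fold test errors as the CV error estimate. This is a minimal illustration under stated assumptions, not the authors' exact algorithm: the candidate pool, the fold counts, and the helper name candidate_models are hypothetical, the selection criterion here is plain inner-CV accuracy rather than the paper's hypothesis testing over expected outcomes, and a public dataset stands in for SPECT.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # illustrative stand-in for SPECT

def candidate_models():
    # Hypothetical pool of statistical classifiers to select among.
    return [LogisticRegression(max_iter=1000),
            SVC(kernel="linear"),
            DecisionTreeClassifier(max_depth=3)]

outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
ensemble, fold_errors = [], []

for train_idx, test_idx in outer.split(X, y):
    X_tr, y_tr = X[train_idx], y[train_idx]
    X_te, y_te = X[test_idx], y[test_idx]
    # Wrapper-type selection on the training fold only; the paper's
    # case-based hypothesis-testing step would plug in here instead of
    # the simple inner-CV accuracy used in this sketch.
    scores = [cross_val_score(m, X_tr, y_tr, cv=3).mean()
              for m in candidate_models()]
    best = candidate_models()[int(np.argmax(scores))]
    best.fit(X_tr, y_tr)
    ensemble.append(best)
    # Error on the held-out fold: averaged, this is the CV error estimate.
    fold_errors.append(1.0 - best.score(X_te, y_te))

# Majority vote over the per-fold selected models forms the ensemble.
votes = np.array([m.predict(X) for m in ensemble])
y_hat = (votes.mean(axis=0) >= 0.5).astype(int)
print("CV error estimate:", np.mean(fold_errors))

Because each fold's winner is evaluated only on data excluded from its selection and training, averaging the fold errors gives the nearly unbiased estimate of generalization error that the abstract refers to.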


