Abstract
In recent years, the application of feature selection methods to biological datasets has greatly increased. Feature selection yields a subset of relevant, informative features, which produces a more interpretable model and can improve prediction accuracy. In addition, ensemble learning can provide a more robust model by combining the results of multiple statistical learning models. We propose an algorithm that uses ensemble methods to select the features and then builds the classification model with the selected features. The proposed approach is a two-step, two-layer (nested) cross-validation method. The first step performs feature selection in the inner loop of cross-validation, whereas the second step builds the classification model in the outer loop of cross-validation. The final classification model obtained with the proposed method has higher prediction accuracy than one obtained with standard cross-validation. Applications of the proposed method are presented using simulated data and three real datasets.
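The two-layer structure described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the ensemble of selectors, the classifier, or the fold counts, so every choice here is an assumption made for illustration. Features are scored by the absolute difference in class means, the "ensemble" is a majority vote over inner training splits, the classifier is a nearest-centroid rule, and the data are synthetic. All function names (`make_data`, `select_by_vote`, `nested_cv`) are hypothetical.

```python
import random
from statistics import mean

def make_data(n=120, p=20, informative=(0, 1, 2), shift=2.0, seed=0):
    """Synthetic two-class data; only the `informative` columns differ by class."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in informative:
            row[j] += shift * label
        X.append(row)
        y.append(label)
    return X, y

def k_folds(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def class_means(X, y, rows, cols):
    """Per-class mean of each requested column, computed over `rows` only."""
    return {c: [mean(X[i][j] for i in rows if y[i] == c) for j in cols]
            for c in (0, 1)}

def top_features(X, y, rows, m):
    """Assumed scoring rule: rank by absolute difference in class means."""
    p = len(X[0])
    mu = class_means(X, y, rows, range(p))
    scores = [abs(mu[1][j] - mu[0][j]) for j in range(p)]
    return sorted(range(p), key=lambda j: -scores[j])[:m]

def select_by_vote(X, y, train, k_inner, m, rng):
    """Inner loop: select on each inner training split, keep majority picks."""
    votes = {}
    for held_out in k_folds(train, k_inner, rng):
        held = set(held_out)
        inner_train = [i for i in train if i not in held]
        for j in top_features(X, y, inner_train, m):
            votes[j] = votes.get(j, 0) + 1
    return sorted(j for j, v in votes.items() if v > k_inner / 2)

def nested_cv(X, y, k_outer=5, k_inner=5, m=3, seed=1):
    """Outer loop: fit a nearest-centroid classifier on the voted features."""
    rng = random.Random(seed)
    accuracies, chosen = [], []
    for test in k_folds(range(len(y)), k_outer, rng):
        held = set(test)
        train = [i for i in range(len(y)) if i not in held]
        feats = select_by_vote(X, y, train, k_inner, m, rng)
        mu = class_means(X, y, train, feats)

        def predict(i):
            dist = {c: sum((X[i][j] - mu[c][t]) ** 2
                           for t, j in enumerate(feats)) for c in (0, 1)}
            return min(dist, key=dist.get)

        accuracies.append(sum(predict(i) == y[i] for i in test) / len(test))
        chosen.append(feats)
    return mean(accuracies), chosen

X, y = make_data()
accuracy, selected = nested_cv(X, y)
print("mean outer-fold accuracy:", round(accuracy, 3))
print("features selected per outer fold:", selected)
```

The point of the design is that the outer test fold is never seen during feature selection, so the reported accuracy is not inflated by selecting features on the same samples used for evaluation.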
More From: Communications in Statistics - Simulation and Computation