Abstract

In recent years, the application of feature selection methods to biological datasets has greatly increased. Feature selection techniques yield a subset of relevant, informative features, which results in a more interpretable model and improves prediction accuracy. In addition, ensemble learning can provide a more robust model by combining the results of multiple statistical learning models. We propose an algorithm that uses ensemble methods to select features and then builds the classification model from the selected features. The proposed approach is a two-step, two-layer cross-validation method: the first step performs feature selection in the inner loop of cross-validation, and the second step builds the classification model in the outer loop. The final classification model obtained with the proposed method has higher prediction accuracy than one obtained with standard cross-validation. The method is demonstrated on simulated data and on three real datasets.
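The two-layer structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature scoring rule, the way selections are combined, or the classifier, so the mean-difference filter, the majority vote across inner folds, the nearest-centroid classifier, and all function names (`make_data`, `k_folds`, `top_features`, `nested_cv`) are assumptions chosen only to keep the example self-contained.

```python
import random
from statistics import mean

def make_data(n=120, p=30, informative=3, seed=0):
    """Synthetic two-class data; only the first `informative` features carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        for j in range(informative):
            row[j] += 2.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y

def k_folds(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    idx = list(indices)
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def class_means(X, y, rows, features):
    """Per-class mean of each listed feature, computed over the given rows."""
    means = {}
    for c in (0, 1):
        members = [r for r in rows if y[r] == c]
        means[c] = {j: mean(X[r][j] for r in members) for j in features}
    return means

def top_features(X, y, rows, m):
    """Assumed filter score: absolute difference between the class means."""
    p = len(X[0])
    mu = class_means(X, y, rows, range(p))
    scores = {j: abs(mu[1][j] - mu[0][j]) for j in range(p)}
    return sorted(scores, key=scores.get, reverse=True)[:m]

def nested_cv(X, y, outer_k=5, inner_k=5, m=5, seed=1):
    rng = random.Random(seed)
    all_rows = range(len(X))
    accuracies, chosen = [], []
    for test_rows in k_folds(all_rows, outer_k, rng):
        held_out = set(test_rows)
        train_rows = [r for r in all_rows if r not in held_out]

        # Step 1 (inner loop): select features on each inner training split
        # and count how often each feature is picked.
        votes = {}
        for val_rows in k_folds(train_rows, inner_k, rng):
            val = set(val_rows)
            inner_train = [r for r in train_rows if r not in val]
            for j in top_features(X, y, inner_train, m):
                votes[j] = votes.get(j, 0) + 1
        # Assumed combination rule: keep features picked in most inner folds.
        selected = sorted(j for j, v in votes.items() if v > inner_k / 2)

        # Step 2 (outer loop): fit a nearest-centroid classifier on the outer
        # training rows using only the selected features, then score it on
        # the held-out outer fold, which played no part in the selection.
        mu = class_means(X, y, train_rows, selected)
        correct = 0
        for r in test_rows:
            dist = {c: sum((X[r][j] - mu[c][j]) ** 2 for j in selected)
                    for c in (0, 1)}
            correct += int(min(dist, key=dist.get) == y[r])
        accuracies.append(correct / len(test_rows))
        chosen.append(selected)
    return mean(accuracies), chosen

X, y = make_data()
accuracy, chosen = nested_cv(X, y)
print(round(accuracy, 3), chosen)
```

The point of the design is that the held-out outer fold never influences which features are selected, so the reported accuracy is not inflated by selection bias. Any filter, wrapper, or ensemble selector could replace `top_features` without changing that structure.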
