Abstract

Feature selection is of great significance in processing high-dimensional data: it can reduce computational cost and improve the performance of analysis. Tree-based classifiers have gained popularity owing to their strong performance, and the feature selection methods derived from them have been widely adopted for dimensionality reduction of high-dimensional datasets. However, which tree-based feature selection method is most suitable for omics datasets has not been comprehensively investigated. In this work, we compare the performance of different tree-based feature selection methods, namely Random Forest (RF), XGBoost, and Gradient Boosting Decision Tree (GBDT), on high-dimensional biological omics data. The results indicate that GBDT performs best, XGBoost performs slightly worse than GBDT, and RF performs worst.
