Abstract

Radiomics features are characterized by high dimensionality and high redundancy. Existing Lasso-based feature selection does not account for features that are only weakly correlated with the classification result, which can degrade the quality of the selected feature subset. This paper proposes a multi-level feature selection algorithm based on a Lasso coefficient threshold (Coe-Thr-Lasso). First, a t-test and a variance filter are used to remove features that have little correlation with the classification result. Second, the proposed algorithm is used to remove redundant features and features that are weakly correlated with the classification result. Three machine learning algorithms, logistic regression (LR), random forest (RF), and support vector machine (SVM), were used to verify the performance of the proposed algorithm on a non-small cell lung cancer subtype classification dataset. Models built on the feature subset generated by the proposed method achieved the best classification performance compared with other published methods. Therefore, the Coe-Thr-Lasso algorithm can effectively remove redundant and irrelevant features in radiomics, improving the quality of the feature subset and the generalization ability of the model.
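The two-level procedure described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the variance cutoff, the significance level, the Lasso penalty, or how the coefficient threshold is chosen, so the function name `coef_threshold_lasso_select`, the parameters `var_min`, `p_max`, `alpha`, and `coef_min`, their default values, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def coef_threshold_lasso_select(X, y, var_min=1e-8, p_max=0.05,
                                alpha=0.01, coef_min=0.01):
    """Return column indices of X that survive every filtering level.

    Hypothetical helper: all thresholds are assumed values, not taken
    from the paper. y is expected to hold binary labels 0 and 1.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    idx = np.arange(X.shape[1])

    # Level 1a: variance filter, drops (near-)constant features.
    idx = idx[X[:, idx].var(axis=0) > var_min]

    # Level 1b: Welch two-sample t-test per feature, keeps features
    # whose class means differ at the assumed significance level.
    _, p = ttest_ind(X[y == 0][:, idx], X[y == 1][:, idx],
                     axis=0, equal_var=False)
    idx = idx[p < p_max]
    if idx.size == 0:
        return idx

    # Level 2: fit Lasso on standardized features, then keep only the
    # features whose absolute coefficient exceeds the threshold. Lasso
    # zeroes out many redundant features; the threshold additionally
    # removes features with small but nonzero coefficients.
    Xs = StandardScaler().fit_transform(X[:, idx])
    coef = Lasso(alpha=alpha, max_iter=10000).fit(Xs, y.astype(float)).coef_
    return idx[np.abs(coef) > coef_min]


# Synthetic demonstration data (not radiomics data).
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 20))
X[:, 0] += 2.0 * y      # feature 0: strongly related to the label
X[:, 1] = 3.0           # feature 1: constant, carries no information

selected = coef_threshold_lasso_select(X, y)
print("selected feature indices:", selected)
```

The returned indices could then be used to subset the feature matrix before fitting LR, RF, or SVM classifiers. In practice the thresholds would need to be tuned, for example by cross-validation on the training split only, to avoid leaking label information into the evaluation.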
