Abstract

Mesothelioma is an aggressive lung cancer that harms the lining of the lungs. It is one of the deadliest cancers diagnosed in people exposed to fibrous silicate minerals (asbestos). Millions of people face severe consequences because they are diagnosed at a late stage. This study compares several machine learning approaches across distinct feature sets and addresses the issue of class imbalance. The dataset used is publicly available in the University of California Irvine (UCI) Machine Learning Repository. Three techniques were used to handle the class imbalance: resampling, the synthetic minority oversampling technique (SMOTE), and adaptive synthetic sampling (ADASYN). Most of the machine learning strategies performed well with the resampling technique. The best accuracy using the resampling strategy was achieved by artificial neural networks (ANN). The highest accuracy, 96%, was recorded on the feature set selected by principal component analysis (PCA). Overall, ensemble techniques performed well. The proposed stacking-based classifier achieved the highest accuracy (89%) on data balanced using SMOTE and ADASYN.
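
As a rough illustration of the oversampling idea behind SMOTE, the sketch below creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours. It is a minimal standard-library sketch, not the study's implementation: the function name `smote`, the parameters `n_new`, `k`, and `seed`, and the toy data are all assumptions made for illustration, and a real experiment would more likely rely on an established library.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points built from the minority samples.

    Hypothetical helper for illustration only. Each synthetic point lies on
    the line segment between a randomly chosen minority sample and one of
    its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up data: 10 majority samples and 3 minority samples.
majority = [[float(i), float(i % 3)] for i in range(10)]
minority = [[0.0, 5.0], [1.0, 6.0], [2.0, 5.5]]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

ADASYN follows the same interpolation idea but, rather than sampling minority points uniformly, generates more synthetic points around minority samples that are surrounded by majority-class neighbours.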

