Abstract

Lung cancer accounts for more cancer deaths than any other type of cancer. Detecting the disease requires imaging-based screening such as X-rays, CT scans, and MRI. Before ordering these tests, however, a doctor ordinarily reviews the patient's medical history and performs a physical examination to assess symptoms and possible risk factors for lung cancer. The lung cancer data set has a class imbalance that affects the performance of the random forest algorithm in predicting lung cancer risk. This study applies SMOTE (Synthetic Minority Over-sampling Technique) together with the random forest algorithm to increase accuracy in predicting lung cancer risk. Data processing and analysis were carried out in Python. The test results show an accuracy of 88% and an AUC of 0.93. These results indicate that SMOTE is useful for handling class imbalance in the data set when random forest is used to predict lung cancer risk.
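To make the oversampling idea concrete, here is a minimal sketch of the core SMOTE step: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is an illustration only, not the study's code. The function name `smote`, the parameter names, and the toy data are all assumptions made for this example, and the real data set is not reproduced here. In practice one would more likely use a library implementation (for example `SMOTE` from imbalanced-learn, followed by scikit-learn's `RandomForestClassifier`) and oversample only the training split.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration; not the study's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base` (Euclidean).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy, made-up data: 8 majority rows and 3 minority rows with 2 features.
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate enough synthetic rows to match the majority class size.
new_rows = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
```

After this step both classes have the same number of rows, so a classifier trained on the combined data no longer sees the minority class as rare. Because the synthetic rows are interpolations, they always lie between existing minority samples rather than introducing values outside the observed range.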
