Abstract
This research analyzes the performance of three classification models, namely the Decision Tree Classifier, Support Vector Machine, and Naive Bayes Classifier, in predicting lung cancer using the "Lung Cancer Prediction" dataset. Performance is evaluated with four metrics: accuracy, weighted precision, weighted recall, and weighted F1. As a preliminary step, exploratory data analysis (EDA) and dataset preprocessing, including feature selection, data cleaning, and data transformation, were conducted. On the test data, the Decision Tree Classifier and Naive Bayes Classifier performed similarly, with high accuracy, precision, recall, and F1 values. The Support Vector Machine was also competitive, although its weighted precision was slightly lower. Additionally, an outlier analysis using box plots showed 2 outlier values for the Decision Tree Classifier, 4 for the Support Vector Machine, and none for Naive Bayes. In conclusion, all three classification models showed good potential for lung cancer prediction. However, selecting the best model requires weighing the evaluation metrics most relevant to the application against the limitations of each model. Further evaluation and in-depth analysis are needed to confirm that the models predict lung cancer cases accurately and consistently.
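For readers unfamiliar with the weighted metrics named above, the following is a minimal sketch of how they are commonly defined: per-class precision, recall, and F1 are averaged with weights proportional to each class's share of the true labels. The function name `weighted_scores` and the integer labels are hypothetical illustrations; they are not taken from the study, and this is not presented as the authors' implementation.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Hypothetical helper for illustration; labels can be any hashable values.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    totals = {"precision": 0.0, "recall": 0.0, "f1": 0.0}

    for label, count in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)

        precision = tp / predicted if predicted else 0.0
        recall = tp / count
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0

        # Weight each class by its share of the true labels.
        weight = count / n
        totals["precision"] += weight * precision
        totals["recall"] += weight * recall
        totals["f1"] += weight * f1

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    return {"accuracy": accuracy, **totals}


# Made-up labels for three classes, purely for illustration.
y_true = [0, 0, 1, 1, 2, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]
scores = weighted_scores(y_true, y_pred)
print(scores)
```

Under this definition, weighted recall always equals accuracy, so the two values carry the same information; differences between models therefore show up in weighted precision and weighted F1.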