Abstract
Background: Thyroid cancer recurrence poses a significant challenge in oncology and calls for effective tools for early prediction. Machine learning models have the potential to improve prognostic accuracy and guide clinical decision-making.
Aim and Objective: This study investigates the efficacy of machine learning models in predicting thyroid cancer recurrence using a publicly available dataset comprising 17 features.
Methods: We developed predictive models with several machine learning algorithms, including Logistic Regression, K-Nearest Neighbors (KNN), Random Forest, and AdaBoost. The target variable was the "Recurred" column, which indicates whether a patient experienced recurrence. Performance was evaluated using Accuracy, Precision, Recall, F1 Score, and ROC AUC. A correlation heatmap was generated to assess relationships between features and detect multicollinearity, and feature importance analysis with the Random Forest model was used to identify key predictors.
Results: Among the models, the Random Forest classifier achieved the highest performance on the test dataset, with an Accuracy of 0.9818, Precision of 0.9623, Recall of 1.0000, F1 Score of 0.9808, and ROC AUC of 0.9831. The feature importance analysis highlighted the factors that most influenced recurrence prediction, and the correlation heatmap provided insight into feature interactions.
Conclusion: This study demonstrates the effectiveness of machine learning models, particularly the Random Forest classifier, in predicting thyroid cancer recurrence. The insights gained from the feature importance and correlation analyses contribute to model interpretability and can inform future feature selection. These findings underline the potential of machine learning to improve patient outcomes through early and accurate recurrence prediction.
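As an illustration of how the reported Accuracy, Precision, Recall, and F1 Score relate to one another, the sketch below computes them from confusion-matrix counts. The counts (51 true positives, 2 false positives, 0 false negatives, 57 true negatives) are an assumption: the abstract does not report a confusion matrix or test-set size, and these are simply one set of values that happens to reproduce the reported figures to four decimals. The function name `classification_metrics` is likewise hypothetical. ROC AUC depends on the model's ranked scores and cannot be derived from these counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, NOT taken from the study: chosen only because they
# are consistent with the metrics quoted in the Results.
metrics = classification_metrics(tp=51, fp=2, fn=0, tn=57)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts, a Recall of 1.0000 means no recurrence case was missed, while the Precision of 0.9623 corresponds to two non-recurrent patients being flagged as recurrent.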