Abstract

Background & Objective: Colorectal cancer (CRC) is one of the most prevalent malignancies in the world. Early detection of CRC is not only a simple process but also the key to treatment. Data mining algorithms are potentially useful in cancer prognosis, diagnosis, and treatment. This study therefore measures the performance of several data mining classification algorithms in predicting CRC and providing an early warning to high-risk groups.

Materials & Methods: This study was performed on 468 subjects, including 194 CRC patients and 274 non-CRC cases, using the CRC dataset from Imam Hospital, Sari, Iran. The Chi-square feature selection method was used to analyze the risk factors. Four popular data mining algorithms were then compared on their performance in predicting CRC, and the best algorithm was identified.

Results: The best outcome was obtained by J-48 (F-measure=0.826, receiver operating characteristic (ROC)=0.881, precision=0.826, and sensitivity=0.827). Bayesian net was the second-best performer (F-measure=0.718, ROC=0.784, precision=0.719, and sensitivity=0.722), followed by random forest (F-measure=0.705, ROC=0.758, precision=0.719, and sensitivity=0.712). The multilayer perceptron technique had the worst performance (F-measure=0.702, ROC=0.76, precision=0.701, and sensitivity=0.703).

Conclusion: According to the results of this study, J-48 could provide better insights for clinical applications than the other proposed prediction models.

© 2021, Zanjan University of Medical Sciences and Health Services. All rights reserved.

Highlights

  • Colorectal cancer (CRC) is the most common gastrointestinal malignancy and the third leading cause of mortality in the world [1, 2]

  • The best outcome was obtained by J-48 with F-measure=0.826, receiver operating characteristic (ROC)=0.881, precision=0.826, and sensitivity=0.827

  • We obtained 15 clinical features as the most important risk factors for CRC prediction according to Equations 2 and 3
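As a rough illustration of how Chi-square feature ranking of this kind can work, the sketch below computes the textbook Pearson chi-square statistic for a binary risk factor against the CRC / non-CRC label and ranks features by it. Equations 2 and 3 are not reproduced on this page, so this is not necessarily the authors' exact formulation. The feature names and the per-feature counts are hypothetical; only the column totals (194 CRC, 274 non-CRC) come from the abstract.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 contingency table.

    Assumed layout (rows: feature present / absent, columns: CRC / non-CRC):
        a  b
        c  d
    Assumes every row and column total is non-zero.
    """
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of feature and outcome.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat


# Hypothetical counts for two made-up binary features.
# Each tuple is (a, b, c, d); columns sum to 194 CRC and 274 non-CRC.
features = {
    "feature_a": (120, 90, 74, 184),
    "feature_b": (80, 113, 114, 161),
}

# Rank features from most to least associated with the outcome.
scores = {name: chi_square_2x2(*counts) for name, counts in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A larger statistic indicates a stronger association between the feature and the outcome, so a selection step would keep the top-ranked features (15 in the study) and pass them to the classifiers.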

Summary

Introduction

Colorectal cancer (CRC) is the most common gastrointestinal malignancy and the third leading cause of mortality in the world [1, 2]. CRC remains a critical public health challenge, with an estimated one million new cases and half a million deaths each year [3, 4]. The incidence of CRC has risen steadily in low-income countries over the past few decades [5, 6], and the disease is becoming the leading cause of cancer-related death in developing Asian countries [7]. In Iran, the CRC growth rate is expected to double over the next two decades, which makes it a critical health challenge [9].
