Abstract

Background: Colorectal cancer (CRC) is the second leading cause of cancer deaths and the third most common cancer. Identifying molecular subtypes of CRC and treating patients accordingly could yield better therapeutic outcomes than treating all CRC patients alike. Prior studies have highlighted both the burden of CRC as a major cause of mortality worldwide and the potential of molecular subtyping to tailor treatment strategies and improve patient outcomes.

Methods: This study proposes an unsupervised learning approach that combines hierarchical clustering with feature selection to identify molecular subtypes, and compares its performance with that of conventional methods. The proposed model used gene expression data from CRC patients obtained from Kaggle, applying dimension reduction techniques followed by Z-score-based outlier removal. Agglomerative hierarchical clustering was used to identify molecular subtypes, with a P-value-based approach for feature selection. The performance of the model was evaluated using several classifiers, including a multilayer perceptron (MLP).

Results: The proposed methodology outperformed conventional methods, with the MLP classifier achieving the highest accuracy, 89%, after feature selection. The model identified molecular subtypes of CRC and distinguished between them based on their gene expression profiles.

Conclusion: This method could aid in developing tailored therapeutic strategies for CRC patients, although further validation and evaluation of its clinical significance are needed.
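
The pipeline described in the Methods can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the dimension reduction technique, the linkage, the statistical test behind the P-values, or any thresholds, so PCA, Ward linkage, a one-way ANOVA F-test, and the cutoffs below are all assumptions. The function names and the synthetic data generator are hypothetical and stand in for the Kaggle dataset.

```python
import numpy as np
from scipy import stats
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import PCA
from sklearn.feature_selection import f_classif
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def make_synthetic_expression(n_samples=150, n_genes=200, n_groups=3, seed=0):
    """Hypothetical stand-in for a CRC expression matrix (samples x genes)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_genes))
    groups = np.repeat(np.arange(n_groups), n_samples // n_groups)
    # Shift a block of 20 genes per group so that there is structure to find.
    for g in range(n_groups):
        X[groups == g, g * 20:(g + 1) * 20] += 3.0
    return X


def run_pipeline(X, n_subtypes=3, n_components=5, z_cutoff=3.0,
                 alpha=0.01, seed=0):
    # 1. Dimension reduction (PCA is an assumption; the abstract names none).
    X_scaled = StandardScaler().fit_transform(X)
    X_reduced = PCA(n_components=n_components,
                    random_state=seed).fit_transform(X_scaled)

    # 2. Z-score-based outlier removal: drop samples with any reduced
    #    coordinate more than z_cutoff standard deviations from the mean.
    z = np.abs(stats.zscore(X_reduced, axis=0))
    keep = (z < z_cutoff).all(axis=1)
    X_scaled, X_reduced = X_scaled[keep], X_reduced[keep]

    # 3. Agglomerative hierarchical clustering assigns candidate subtypes
    #    (Ward linkage and the number of subtypes are assumptions).
    labels = AgglomerativeClustering(
        n_clusters=n_subtypes, linkage="ward").fit_predict(X_reduced)

    # 4. P-value-based feature selection: keep genes whose one-way ANOVA
    #    P-value across the clusters falls below alpha (assumed test).
    _, p_values = f_classif(X_scaled, labels)
    selected = np.flatnonzero(p_values < alpha)

    # 5. Train an MLP on the selected genes to predict the cluster labels.
    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled[:, selected], labels, test_size=0.3,
        stratify=labels, random_state=seed)
    clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000,
                        random_state=seed)
    clf.fit(X_train, y_train)
    return labels, selected, clf.score(X_test, y_test)


if __name__ == "__main__":
    X = make_synthetic_expression()
    labels, selected, accuracy = run_pipeline(X)
    print(f"{len(selected)} genes selected, held-out accuracy {accuracy:.2f}")
```

In a setup like this, the reported accuracy measures how well the classifier reproduces the cluster assignments, not agreement with clinically validated subtypes. The sketch also selects genes before the train/test split for brevity; selecting them inside each training fold would give a less optimistic estimate.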

