A hierarchical clustering approach for colorectal cancer molecular subtypes identification from gene expression data

Shivangi Raghav, Aastha Suri, Deepika Kumar, Aakansha Aakansha, Muskan Rathore, Sudipta Roy

doi:10.1016/j.imed.2023.04.002

Open Access

https://doi.org/10.1016/j.imed.2023.04.002

Journal: Intelligent Medicine
Publication Date: May 14, 2023
Citations: 3
License type: CC BY-NC-ND

Affiliation: Bharati Vidyapeeth Deemed University

Abstract

Background: Colorectal cancer (CRC) is the second leading cause of cancer deaths and the third most common cancer. Identifying molecular subgroups of CRC and treating patients accordingly could yield better therapeutic outcomes than treating all CRC patients alike. Previous studies have highlighted both the role of CRC as a major cause of mortality worldwide and the potential of molecular subtyping to tailor treatment strategies and improve patient outcomes.

Methods: This study proposed an unsupervised learning approach that combines hierarchical clustering with feature selection to identify molecular subtypes, and compared its performance with that of conventional methods. The proposed model used gene expression data from CRC patients obtained from Kaggle and applied dimension reduction techniques followed by Z-score-based outlier removal. Agglomerative hierarchical clustering was used to identify molecular subtypes, with a P-value-based approach for feature selection. The performance of the model was evaluated using several classifiers, including a multilayer perceptron (MLP).

Results: The proposed methodology outperformed conventional methods, with the MLP classifier achieving the highest accuracy of 89% after feature selection. The model identified molecular subtypes of CRC and distinguished among them based on their gene expression profiles.

Conclusion: This method could aid in developing tailored therapeutic strategies for CRC patients, although further validation and evaluation of its clinical significance are needed.
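To make two of the steps named in the abstract concrete, the following is a minimal pure-Python sketch of Z-score-based outlier removal followed by agglomerative clustering. It is an illustration under stated assumptions, not the authors' implementation: the function names `remove_outliers` and `agglomerative`, the toy `profiles` data, the Z-score threshold of 2.0 (chosen because the toy sample is tiny), and the use of average linkage with Euclidean distance are all hypothetical choices that the abstract does not specify. Dimension reduction, P-value-based feature selection, and the MLP classifier are omitted.

```python
import math
import statistics


def remove_outliers(samples, threshold=3.0):
    """Drop every sample that has a feature with |z-score| above threshold."""
    columns = list(zip(*samples))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) for col in columns]
    kept = []
    for row in samples:
        # A constant feature (stdev 0) contributes a z-score of 0.
        z = [abs(x - m) / s if s > 0 else 0.0
             for x, m, s in zip(row, means, stdevs)]
        if max(z) <= threshold:
            kept.append(row)
    return kept


def agglomerative(samples, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns a list of clusters, each a list of indices into samples.
    """
    # Start with every sample in its own cluster.
    clusters = [[i] for i in range(len(samples))]

    def linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        return statistics.fmean(
            math.dist(samples[i], samples[j]) for i in a for j in b
        )

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        p, q = min(
            ((i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Invented toy "expression profiles": two tight groups plus one extreme sample.
profiles = [
    (1.0, 1.1), (1.2, 0.9), (0.9, 1.0),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
    (50.0, 1.0),
]

kept = remove_outliers(profiles, threshold=2.0)
subtypes = agglomerative(kept, n_clusters=2)
print(subtypes)
```

On real gene expression matrices, a library implementation would be far more efficient than this quadratic-per-merge loop; the sketch only shows the bottom-up merging logic that "agglomerative" refers to.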

Full Text