Abstract

Breast cancer is a common type of cancer affecting women worldwide. Continuous efforts are being made to identify significant genes/biomarkers for the prognosis of breast cancer. These prognostic biomarkers are useful for predicting the resemblance between query (new) patients and reference (existing) patients. Here, the 1d-DDg (one-dimensional data-driven grouping) model has been used to build a prognostic model for breast cancer diagnosis. The Cox proportional hazards regression model has been applied to microarray gene expression data and clinical information to select the predictive genes for breast cancer. Based on these biomarkers, patients are categorized into two groups, namely low-risk and high-risk. The Manhattan distance is then applied to compute the resemblance/similarity between query (newly admitted) patients and reference (existing) patients. Two breast cancer datasets, with accession numbers GSE2990 and GSE45255, obtained from the National Center for Biotechnology Information (NCBI) data portal and containing miRNA and mRNA expression profiles, have been used in the experiments. The clinical information, i.e., disease relapse, overall survival, body mass index, and age, is available in both datasets. Microarray gene expression data along with clinical data have been considered when computing the resemblance between query and reference patients. For computing resemblance, the literature suggests that the Manhattan distance is more appropriate than the Euclidean distance for high-dimensional vectors/data. In this regard, the Manhattan and Euclidean distances have also been compared on the basis of elapsed time. The experimental results show that the Manhattan distance executes faster than the Euclidean distance. Therefore, to obtain a faster response without losing the quality and accuracy of the solution, the reference patients have been ranked using the Manhattan distance. Treatment for the query patient is chosen based on the reference patient ranked first in resemblance. This Manhattan distance-based algorithm, which uses genetic as well as clinical data, is a new approach to breast cancer prognosis.
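The ranking step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the patient identifiers, the feature vectors, and the function names (`manhattan`, `euclidean`, `rank_references`) are hypothetical, and the abstract does not say how features are scaled or combined, so the values below are made up purely for illustration.

```python
from math import sqrt

def manhattan(a, b):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean (L2) distance, included only for comparison."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank_references(query, references, distance=manhattan):
    """Return (patient_id, distance) pairs sorted from most to least similar."""
    scored = [(pid, distance(query, vec)) for pid, vec in references.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical feature vectors: three gene expression values plus age.
# These numbers are invented for the example, not taken from GSE2990 or GSE45255.
references = {
    "ref_A": [2.0, 5.0, 1.0, 60.0],
    "ref_B": [2.5, 4.0, 1.5, 52.0],
    "ref_C": [7.0, 1.0, 3.0, 45.0],
}
query = [2.4, 4.2, 1.4, 50.0]

ranking = rank_references(query, references)
best_id, best_distance = ranking[0]  # reference patient ranked first in resemblance
print(ranking)
```

Passing `distance=euclidean` to `rank_references` gives the Euclidean ranking, which is one way to set up the kind of side-by-side comparison the abstract mentions.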

