Abstract

Genomic profiles of breast cancer survivors who received similar treatment may provide clues about the key biological processes involved in the cells and help in finding the right treatment. More specifically, such profiling may help personalize treatment based on patients' gene expression. In this paper, we present a hierarchical machine learning system that predicts the 5-year survivability of patients who underwent a specific therapy. The classes are built by combining two parts: the survivability information and the given therapy. The survivability part defines whether the patient survived the 5-year interval or is deceased, while the therapy part denotes the therapy received during that interval (hormone therapy, radiotherapy, or surgery), which gives six classes in total. The model classifies one class vs. the rest at each node, so the tree-based model has five nodes. The model is trained using a set of standard classifiers on a comprehensive study dataset that includes genomic profiles and clinical information of 347 patients. A combination of feature selection methods and a prediction method is applied at each node to identify the genes that can predict the class at that node; the identified genes for each class may serve as potential biomarkers for that class's treatment for better survivability. The results show that the model identifies the classes with high performance measures. An exhaustive analysis based on relevant literature shows that some of the potential biomarkers are strongly related to breast cancer survivability and to cancer in general.
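The one-vs-rest cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class labels other than DH, the node order, and the toy nearest-centroid classifier (standing in for the "standard classifiers") are all assumptions made for illustration, and feature selection is omitted.

```python
from statistics import fmean

# Hypothetical labels: survivability (D = deceased, S = survived) combined
# with therapy (H = hormone, R = radiotherapy, S = surgery). Only "DH" is
# named in the text; the other labels and the node order are assumptions.
CLASS_ORDER = ["DH", "DR", "DS", "SH", "SR", "SS"]


class CentroidBinary:
    """Toy binary classifier: assigns a sample to the nearer class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in (True, False):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [fmean(col) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return dist(self.centroids[True]) < dist(self.centroids[False])


def fit_cascade(X, labels, class_order=CLASS_ORDER):
    """Train one node per class except the last: six classes give five nodes."""
    nodes = []
    X, labels = list(X), list(labels)
    for target in class_order[:-1]:
        # One class vs. the rest on the instances still in the dataset.
        y = [lab == target for lab in labels]
        nodes.append((target, CentroidBinary().fit(X, y)))
        # Remove the target class before training the next node.
        keep = [i for i, lab in enumerate(labels) if lab != target]
        X = [X[i] for i in keep]
        labels = [labels[i] for i in keep]
    return nodes


def predict_cascade(nodes, x, class_order=CLASS_ORDER):
    """Walk the nodes in order; the first node that fires decides the class."""
    for target, clf in nodes:
        if clf.predict_one(x):
            return target
    return class_order[-1]  # no node fired: the last remaining class
```

Because each node only separates one class from what remains, the last class needs no node of its own, which is why six classes yield five nodes.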

Highlights

  • Despite the fast increase in the breast cancer incidence rate, the survival rates have increased due to improvements in the treatments because of new technologies (Siegel et al, 2016)

  • The developed multi-class model shows the final results for each node and the performance measures that were considered, such as accuracy, sensitivity, F1-measure, and specificity

  • The second node is obtained after removing the Deceased and Hormone (DH) instances from the dataset and classifying each class against the rest
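The performance measures named in the highlights have standard definitions in terms of one-vs-rest confusion-matrix counts. The sketch below shows those definitions only; the function name and any counts passed to it are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard one-vs-rest measures from confusion-matrix counts."""

    def safe_div(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    sensitivity = safe_div(tp, tp + fn)   # recall, true-positive rate
    specificity = safe_div(tn, tn + fp)   # true-negative rate
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }
```

At each node the target class is treated as the positive class and all remaining classes as negative.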

Summary

INTRODUCTION

Despite the rapid increase in the breast cancer incidence rate, survival rates have increased owing to treatment improvements enabled by new technologies (Siegel et al., 2016). Mangasarian and Wolberg (2000) used a linear support vector machine (SVM) to extract 6 out of 31 clinical features. Their dataset contains samples from 253 breast cancer patients. Using samples from patients with high-risk clinical features in the early stages of breast cancer, Cardoso et al. (2016) proposed the use of a statistical model to determine the necessity of chemotherapy treatment based on clinical data. Patients have received different therapies, sometimes in combination (Paredes-Aracil et al., 2017), which makes it difficult to relate genomic activity to a specific therapy during survival prediction. In the present paper, we extend an earlier supervised learning model, which showed preliminary results, to predict which BC patients will survive beyond 5 years after undergoing a given treatment therapy (Tabl et al., 2018b). The model has been refined and validated by comparison with the feature selection approach mRMD 2.0, visual analysis, and biological validation for a set of 12 potential biomarkers (FGF16, ASAP1, FBXO41, FOSB, VAMP4, ARFGAP2, BLP, CT47A1, PRPS1, ICOSLG, ARPC3, ZFP91) from the resulting 47 genes in all classification nodes.

MATERIALS AND METHODS
RESULTS AND DISCUSSION
CONCLUSION