Abstract

BRCA1/2 gene testing is difficult, expensive, and time-consuming, and it imposes a heavy workload. Identifying BRCA1/2 gene mutations is important for selecting treatment and for assessing the risk of secondary cancer. In the present study, we aimed to develop an algorithm that identifies BRCA1/2 negativity from all the clinical, demographic, and genetic features of patients. An experimental dataset was created by collecting all the clinical, demographic, and genetic features of breast cancer patients over 20 years. This dataset consisted of 125 features of 2070 high-risk breast cancer patients. All data were converted to numerical form and normalized so that the machine learning algorithm could detect BRCA1/2 negativity. The performance of the algorithm was assessed by evaluating the machine learning model on the test data. The k-nearest neighbours (KNN) and decision tree (DT) accuracy rates obtained with Dataset 2, which involved 9 features, were found to be the most effective. Removing unnecessary data from the dataset by reducing the number of features was shown to increase the accuracy rate of the algorithm compared with the DT. With this algorithm, BRCA1/2 negativity was identified in high-risk breast cancer patients with 92.88% accuracy within minutes, without performing the BRCA1/2 gene test, and the stress of waiting for test results and the associated loss of time and money were avoided. The algorithm is suggested to be useful for planning patients' treatment quickly and accurately, in addition to speeding up clinical practice.
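
The normalize-then-classify workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the normalization scheme, the distance metric, or the value of k, so min-max scaling, Euclidean distance, and k = 3 are assumptions here, and the two features and six records are hypothetical toy values rather than study data.

```python
import math
from collections import Counter


def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns map to 0.0.

    Min-max scaling is an assumption; the abstract only says "normalized".
    """
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training rows.

    Euclidean distance and k=3 are assumptions, not values from the study.
    """
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy records: [age at diagnosis, number of affected relatives].
features = [[30, 3], [35, 2], [38, 3], [60, 0], [65, 1], [70, 0]]
labels = ["positive", "positive", "positive",
          "negative", "negative", "negative"]

# Normalize the training rows together with the new record, then split again.
scaled = min_max_normalize(features + [[62, 0]])
train_x, query = scaled[:-1], scaled[-1]

prediction = knn_predict(train_x, labels, query, k=3)
print(prediction)
```

Scaling matters for KNN because an unscaled feature with a large numeric range (age, in this toy case) would otherwise dominate the distance calculation.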

Highlights

  • Machine learning is a computer-based predictive method, or estimation algorithm, that assumes a hypothesis and uses it to estimate an unknown condition by means of various mathematical and statistical methods

  • The question arises of how machine learning will enter medical practice

  • The raw dataset, comprising 125 features of 2070 high-risk breast cancer patients, was used to detect BRCA1/2 negativity

Introduction

Machine learning is a computer-based predictive method, or estimation algorithm, that assumes a hypothesis and uses it to estimate an unknown condition by means of various mathematical and statistical methods. With the rapid development of highly productive technologies in recent years, the data produced in biology and medicine have become more heterogeneous and complex, and evaluating them with conventional statistical analyses has become very difficult. Given this rapid development and the growing size of these data, different methods such as advanced machine learning algorithms are required. A review of the literature showed that machine learning has been applied successfully to numerous problems such as classification, regression, and clustering [9, 10].
