Machine Learning Algorithms for Prediction of Survival Curves in Breast Cancer Patients.

Roqia Saleem Awad Maabreh, Malik Bader Alazzam, Ahmed S Alghamdi, Fahd Abd Algalil

doi:10.1155/2021/9338091

Abstract

Today, cancer is the second leading cause of death worldwide, and the number of people diagnosed with the disease is expected to rise. Breast cancer is the most commonly diagnosed cancer in women, and it has one of the highest survival rates when treated properly. Because the effectiveness of treatment and, as a result, the survival of the patient depend on the individual case, it is critical to model survival ahead of time. Artificial intelligence is a rapidly expanding field, and its clinical applications are following suit (having surpassed humans in many evidence-based medical tasks). Since the first stable risk estimator based on statistical methods appeared in survival analysis, numerous versions of it have been created, with machine learning used in only a few of them. Nonlinear relationships between variables and the impact they have on the variable to be predicted are very easy to evaluate using statistical methods. However, because these methods are just mathematical equations, they have flaws that limit the quality of their output. The main goal of this study is to find the best machine learning algorithms for predicting the individualised survival of breast cancer patients, as well as the most appropriate treatment, and to propose new stratifications of numerical variables. These stratifications are produced with unsupervised machine learning methods that divide patients into groups based on their risk in each dataset. We compare them with standard groupings to see whether they are more significant. Knowing that the greatest challenge in dealing with clinical data is its quantity and quality, we have gone to great lengths to ensure the quality of the data before replicating it. We used the Cox statistical method, which makes multivariate analysis easy, in conjunction with other statistical methods and tests to find the best possible dataset with which to train our model.

Highlights

Cancer is the second leading cause of death worldwide, with an estimated 9.6 million deaths in 2018 (1 in 6 deaths), and the numbers of diagnoses and deaths continue to increase each year [1].
We will analyze how machine learning models behave when predicting death and recurrence, compared with statistical methods, while using the strengths of the latter to select a set of variables.
The second objective was to discover whether, for the numerical variables in our datasets, stratifications created by unsupervised machine learning, which adapt to the data, explain the risk of death and recurrence better than the standard ones.
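
To make the idea of a data-driven stratification concrete, the following is a minimal sketch that derives cut points for one numerical variable with one-dimensional k-means (Lloyd's algorithm). The source does not name the unsupervised method used, so k-means is an assumption chosen for illustration; the function names `kmeans_1d` and `cut_points`, the choice of three groups, and the age values are all hypothetical.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on a single numerical variable; returns sorted centroids."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest centroid.
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            groups[nearest].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        updated = [sum(g) / len(g) if g else centroids[j] for j, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def cut_points(centroids):
    """Boundaries between strata: midpoints of adjacent centroids."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]


# Hypothetical ages at diagnosis (illustrative values, not taken from the study).
ages = [31, 34, 36, 38, 52, 55, 57, 59, 74, 77, 79, 82]
centroids = kmeans_1d(ages, k=3)
boundaries = cut_points(centroids)
print(centroids, boundaries)
```

The resulting boundaries could then be compared with a standard grouping of the same variable, for example by checking which grouping separates the survival curves of the strata more clearly.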

Summary

Introduction

Cancer is the second leading cause of death worldwide, with an estimated 9.6 million deaths in 2018 (1 in 6 deaths), and the numbers of diagnoses and deaths continue to increase each year [1]. The main application of survival analysis in medicine is to analyze events of interest: death, relapse, adverse reaction to a drug, or the development of a new disease. For all these cases, it is possible to model the risk of the event taking place over a time range from weeks to years, depending on the case. Some studies have applied computational learning [5], and their precision results improve significantly. In this project, we will analyze how machine learning models behave when predicting death and recurrence, compared with statistical methods, while using the strengths of the latter to select an optimal set of variables with which to train our model, a task in which they have given very good results over time.
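
As an illustration of how the risk of an event can be modelled over time, the sketch below computes a Kaplan-Meier survival curve from follow-up times and event indicators. This is a standard non-parametric estimator used here only as a simple example; it is not the Cox model or the machine learning models studied in the paper. The function name `kaplan_meier` and the toy cohort are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time per patient (any consistent unit).
    events: 1 if the event (e.g. death) was observed, 0 if the patient was censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Hypothetical toy cohort: follow-up in months, 1 = death observed, 0 = censored.
durations = [6, 12, 12, 18, 24, 30]
events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(durations, events)
for time, probability in curve:
    print(f"month {time}: S(t) = {probability:.3f}")
```

Censored patients (event indicator 0) contribute to the number at risk until their last follow-up but never lower the curve, which is what distinguishes survival analysis from a plain classification of who died.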

Objectives

Methods

Discussion

Conclusion