<p>Breast cancer is the most frequent cancer in women and the second-leading cause of cancer-related deaths globally. The main problems in managing breast cancer are its high heterogeneity and the development of therapeutic resistance. White blood cell, omics, and large Wisconsin Diagnostic Breast Cancer datasets reflect three decades of the genomic revolution and advance the understanding of cellular function. The precision of cancer diagnosis has also increased over the past decades. High-throughput sequencing, screening, and artificial intelligence technologies have significantly improved and expanded the methodologies used for diagnosis, prognosis, and therapy. This paper follows several phases of breast cancer, studies datasets, and evaluates several machine learning (ML) algorithms used for analysis and feature selection. K-means, similarity correlation, a genetic algorithm, and principal component analysis are used to identify the subset of proteins with the highest significance for breast cancer prediction using different biomarkers. The best Pearson correlation between copy number and protein is 0.014, and the genetic algorithm achieves an accuracy of 93.5% on multi-omics datasets.</p>