Breast cancer can be identified through genomic analysis of gene expression data, such as mRNA. The approach examines gene expression patterns in breast tissue samples to distinguish cancerous from healthy tissue or to differentiate breast cancer subtypes. This research develops a computational model for breast cancer classification using machine learning and hyperparameter optimization algorithms. Its primary objective is to apply several machine learning algorithms to classify breast cancer from gene expression data and to improve on the models developed in previous studies. The paper also provides an extensive review of prior breast cancer classification research and offers new theoretical perspectives. The study takes a problem-solving approach built on conventional machine learning techniques, most notably the decision tree, and evaluates k-nearest neighbor, naïve Bayes, random forest, extra tree classifier, and support vector machine models for comparison. Each model is evaluated with a classification report covering precision, recall, F1-score, and accuracy. The decision tree model performed best, achieving 99.73% accuracy and a score of 1 for precision, recall, and F1-score.
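The four metrics named in the evaluation step can be illustrated with a minimal sketch. The function name `classification_metrics` and the toy labels below are hypothetical and invented for illustration; they are not the study's code or data, and the study may have used a library report rather than a hand-written calculation.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, positive):
    """Return precision, recall, and F1 for one class, plus overall accuracy."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Count every (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))
    tp = sum(n for (t, p), n in pairs.items() if t == positive and p == positive)
    fp = sum(n for (t, p), n in pairs.items() if t != positive and p == positive)
    fn = sum(n for (t, p), n in pairs.items() if t == positive and p != positive)
    correct = sum(n for (t, p), n in pairs.items() if t == p)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = correct / len(y_true)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Toy labels, invented for illustration only (not data from the study).
y_true = ["tumor", "tumor", "tumor", "tumor",
          "normal", "normal", "normal", "normal"]
y_pred = ["tumor", "tumor", "tumor", "normal",
          "normal", "normal", "normal", "normal"]

metrics = classification_metrics(y_true, y_pred, positive="tumor")
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this toy case one tumor sample is missed, so precision for the tumor class stays at 1 while recall and accuracy drop below it, which shows why the four metrics are reported together rather than accuracy alone.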