Abstract

Cancer is one of the main causes of death, and breast cancer is the most common cancer in women. Breast cancer (Carcinoma mammae) is defined as a malignant neoplasm originating from the breast parenchyma. It ranks first among cancers in Indonesia by number of cases and is one of the leading contributors to cancer deaths. According to Globocan 2020 data, breast cancer accounted for 68,858 new cases (16.6%) of the 396,914 new cancer cases in Indonesia, and the number of deaths exceeded 22 thousand (Romkom, 2022). The death rate is increasing because of a lack of information about the early symptoms and dangers of breast cancer. A system is therefore needed that can provide information about breast cancer, such as an early diagnosis. Classification techniques from data mining can be used to predict, from several parameters, which patients will develop breast cancer and which will not. This study compares the classification of breast cancer using the Decision Tree ID3 algorithm and the K-Nearest Neighbors algorithm. The attributes used are Menopause, Tumor-Size, Node-Caps, Deg-Malig, Breast, Breast-Squad and Irradiant. The main objective of this study is to improve classification performance in breast cancer diagnosis by applying feature selection to several classification algorithms. The Decision Tree ID3 algorithm achieved an accuracy of 93.333%, and the K-Nearest Neighbors algorithm achieved an accuracy of 76.6667%.
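
To make the two compared techniques concrete, the following is a minimal sketch, not the study's implementation. It shows the information-gain criterion that ID3 uses to choose a split attribute, and a K-Nearest Neighbors vote using Hamming distance over categorical attributes. The six toy records, the three-attribute layout (node-caps, deg-malig, irradiation), the `positive`/`negative` labels, the choice of Hamming distance, and all function names are assumptions made for illustration only; none of them come from the study's data or code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting on the attribute at index `attr`."""
    total = len(rows)
    remainder = 0.0
    for value in set(row[attr] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """ID3 split rule: pick the attribute index with the highest information gain."""
    return max(range(len(rows[0])), key=lambda a: information_gain(rows, labels, a))

def knn_predict(rows, labels, query, k=3):
    """Majority vote among the k training rows closest to `query` (Hamming distance)."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))
    nearest = sorted(zip(rows, labels), key=lambda pair: hamming(pair[0], query))[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]

# Hypothetical toy records: (node_caps, deg_malig, irradiation). Not the study's data.
rows = [
    ("yes", "3", "yes"),
    ("yes", "3", "no"),
    ("no", "1", "no"),
    ("no", "2", "no"),
    ("yes", "2", "yes"),
    ("no", "1", "no"),
]
labels = ["positive", "positive", "negative", "negative", "positive", "negative"]

print("Best split attribute index:", best_attribute(rows, labels))
print("KNN prediction:", knn_predict(rows, labels, ("yes", "2", "no"), k=3))
```

A full ID3 tree would apply `best_attribute` recursively to each resulting subset until the subsets are pure or no attributes remain. An accuracy comparison like the one reported above would additionally require a held-out test split, which this sketch omits.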
