Abstract

Breast cancer is the most common cause of cancer death among women and the most frequently diagnosed cancer among women in 140 of 184 countries worldwide. Several factors are known to increase the risk of breast cancer, but they differ from country to country, depending on routine treatment practice. This research examines the determinant factors of breast cancer in Indonesia, where the situation differs slightly from that in the United States of America. In the United States of America, most women come to the hospital in the early stages of breast cancer and receive medical treatment promptly, which decreases the risk of malignant breast cancer. The dataset was obtained from an oncology hospital in East Java, Indonesia, and consists of 1907 samples, 21 attributes and 2 classes. We used three feature selection algorithms, Information Gain, Fisher's Discriminant Ratio and Chi-square, to select the attributes that contribute most to the data. We also applied Hierarchical K-means Clustering to remove the attributes with the lowest contribution. Our experiments showed that only 14 of the 21 original attributes contribute most to the breast cancer data. The clustering error ratio decreased from 44.48% (using the 21 original attributes) to 18.32% (using the 14 most important attributes).
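
As an illustration of the three feature selection criteria named above, the following minimal sketch scores attributes of a two-class dataset with Information Gain, a two-class Fisher's Discriminant Ratio and the Chi-square statistic. It uses textbook definitions, which may differ from the exact variants used in this study. The toy data, the attribute names `a`, `b`, `c` and the helper `rank_attributes` are hypothetical and are not taken from the hospital dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the entropy remaining after splitting on a discrete attribute."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def fisher_ratio(values, labels):
    """Two-class Fisher's discriminant ratio: (mean0 - mean1)^2 / (var0 + var1)."""
    stats = []
    for c in sorted(set(labels)):  # assumes exactly two classes
        xs = [x for x, y in zip(values, labels) if y == c]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        stats.append((mean, var))
    (m0, v0), (m1, v1) = stats
    if v0 + v1 == 0:
        # Zero within-class variance: perfectly separated or identical classes.
        return float("inf") if m0 != m1 else 0.0
    return (m0 - m1) ** 2 / (v0 + v1)


def chi_square(values, labels):
    """Pearson chi-square statistic between a discrete attribute and the class."""
    n = len(labels)
    value_counts = Counter(values)
    label_counts = Counter(labels)
    observed = Counter(zip(values, labels))
    stat = 0.0
    for v, nv in value_counts.items():
        for c, nc in label_counts.items():
            expected = nv * nc / n
            stat += (observed[(v, c)] - expected) ** 2 / expected
    return stat


def rank_attributes(columns, labels, score):
    """Return attribute names sorted from highest to lowest score (hypothetical helper)."""
    return sorted(columns, key=lambda name: score(columns[name], labels), reverse=True)


# Synthetic toy data (an assumption for illustration only).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "a": [0, 0, 0, 0, 1, 1, 1, 1],  # perfectly separates the classes
    "b": [0, 1, 0, 1, 0, 1, 0, 1],  # carries no class information
    "c": [0, 0, 0, 1, 0, 1, 1, 1],  # partially informative
}

for criterion in (information_gain, fisher_ratio, chi_square):
    print(criterion.__name__, rank_attributes(columns, labels, criterion))
```

Each criterion yields its own ranking; how the three rankings are combined to arrive at a final attribute subset is not specified here, so the sketch leaves them separate.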
