Abstract

Thyroid disease is becoming increasingly common among women over the age of 30. It is therefore necessary to examine thyroid datasets so that the disease can be predicted at an early stage and precautions can be taken before it progresses to a dangerous condition such as thyroid cancer. Decision trees can be used to extract hidden patterns from stored datasets. The objective of this paper is to examine a thyroid disease dataset using decision tree, random forest, and classification and regression tree (CART) classifiers, and then to improve on their results using the bagging ensemble technique. The experiment was carried out on 3710 instances of thyroid patients with 29 features. The prediction target is a variable with two classes, sick and negative. Prediction accuracy was calculated for different num-fold and seed values. The individual classification algorithms (decision tree, random forest tree, and extra tree) give accuracies of 98%, 99%, and 93%, respectively. We then developed a bagging ensemble method that combines the three basic tree classifiers and applied it to the same dataset, which gives a better accuracy of 100% for a seed value of 35 and a num-fold value of 10. The proposed ensemble method can be used for better prediction of thyroid disease.
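The evaluation protocol described in the abstract (bagging over tree classifiers, scored by cross-validation with a fixed seed and number of folds) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, synthetic data in place of the thyroid dataset, and one-split decision trees (stumps) as stand-ins for the full tree classifiers. All function names and the values of `n_estimators` and the data size are assumptions chosen for illustration; only `num_folds=10` and `seed=35` come from the abstract.

```python
import random
from collections import Counter

def train_stump(X, y):
    """Fit a one-split tree: choose the feature/threshold with the fewest errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda row: l_lab if row[f] <= t else r_lab

def train_bagging(X, y, n_estimators, rng):
    """Bagging: fit each base model on a bootstrap sample, predict by majority vote."""
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return lambda row: Counter(m(row) for m in models).most_common(1)[0][0]

def cross_val_accuracy(X, y, num_folds=10, seed=35, n_estimators=9):
    """Return per-fold accuracy of the bagged model under seeded k-fold CV."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    folds = [order[i::num_folds] for i in range(num_folds)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        model = train_bagging([X[i] for i in train], [y[i] for i in train],
                              n_estimators, rng)
        correct = sum(model(X[i]) == y[i] for i in fold)
        scores.append(correct / len(fold))
    return scores

# Synthetic two-class data (NOT the thyroid dataset): label depends on feature 0.
data_rng = random.Random(0)
X = [[data_rng.random(), data_rng.random()] for _ in range(100)]
y = ["sick" if row[0] > 0.7 else "negative" for row in X]

scores = cross_val_accuracy(X, y)
mean_accuracy = sum(scores) / len(scores)
print(f"mean accuracy over {len(scores)} folds: {mean_accuracy:.3f}")
```

Because both the fold assignment and the bootstrap samples are drawn from a generator seeded with `seed`, repeated runs with the same seed and fold count give identical scores, which is why reported accuracy can vary with these two settings.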
