Abstract

Indonesia is among the most densely populated countries in the world and has a very high number of tuberculosis (TB) cases. TB is a serious disease because it is easily transmitted through the air via droplets released when a TB patient coughs or sneezes. Diagnostic data sets often contain missing values, typically as a result of errors in the data collection process, so this study proposes mean imputation to handle missing data. To classify TB data from Bangkalan Regency, Indonesia, comprising 886 records, the study uses Naive Bayes and compares it with logistic regression. Training and test data are split using K-fold cross-validation with k=10. In the trials, mean imputation fills in the missing data better than one imputation for this case: with Naive Bayes, mean imputation yields an average accuracy of 97.36% and an F1 score of 95.01%, compared with an average accuracy of 97.35% and an F1 score of 94.35% for one imputation. For TB classification, Naive Bayes yields an average accuracy of 97.36% and an F1 score of 95.01%, outperforming logistic regression, which reaches the same accuracy of 97.36% but an F1 score of only 89.58%.
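The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`mean_impute`, `k_fold_indices`, `accuracy_and_f1`), the use of `None` to mark a missing value, the contiguous fold layout, and the toy data are all assumptions made for illustration. Only the 886-record count and k=10 come from the abstract.

```python
from statistics import mean


def mean_impute(rows):
    """Replace None entries in each column with the mean of that column's observed values."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Assumption: a column with no observed values falls back to 0.0.
        col_means.append(mean(observed) if observed else 0.0)
    return [[col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, near-equal folds."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 score) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy data (hypothetical, not from the study): two features, None = missing.
toy_rows = [[1.0, None], [3.0, 4.0], [None, 8.0]]
imputed = mean_impute(toy_rows)

# Fold sizes for a data set of 886 records with k=10.
folds = list(k_fold_indices(886, k=10))
test_sizes = [len(test) for _, test in folds]
```

In a full pipeline, a classifier would be trained on each `train_idx` and scored on the matching `test_idx`, and the per-fold accuracy and F1 values averaged. Computing the column means on the training fold only avoids leaking test information into the imputation; the abstract does not say which variant the study used.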
