Data mining is an essential research area that is increasingly popular in health organizations. Heart disease has been the leading cause of death worldwide over the past 10 years. The healthcare industry gathers enormous amounts of heart disease data, which are not “mined” to discover hidden information for effective decision making. This research provides a detailed description of the Naïve Bayes classifier, the decision tree classifier, and the Selective Bayesian classifier as applied to the prediction of heart disease. The Naïve Bayesian classifier (NB) is known to work very well on some domains and poorly on others; in particular, its performance suffers in domains that involve correlated features. C4.5 decision trees, on the other hand, typically perform better than NB on such domains. This paper describes a Selective Bayesian classifier (SBC) that combines the two kinds of classifier: it uses only those features that C4.5 would use in its decision tree when learning from a small sample of the training set. Experiments conducted on the Cleveland dataset indicate that SBC performs reliably better than NB and also outperforms C4.5, which in turn outperforms NB on this dataset. A comparison of these predictive data mining techniques on the same dataset shows that the decision tree outperforms the Naïve Bayesian classifier and that the Selective Bayesian classifier achieves better accuracy than the other classifiers.
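The feature-selection idea behind SBC can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: all features are categorical, an ID3-style information-gain tree stands in for C4.5 (which additionally uses gain ratio, pruning, and continuous-attribute handling), and the four-row toy data and all function names are hypothetical. The second feature is a copy of the first, to mimic the correlated features that hurt NB.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def tree_features(rows, labels, candidates, max_depth=3):
    """Feature indices an ID3-style tree splits on (a stand-in for C4.5)."""
    if max_depth == 0 or len(set(labels)) == 1 or not candidates:
        return set()
    base, best_gain, best_f = entropy(labels), 0.0, None
    for f in candidates:
        groups = defaultdict(list)
        for row, y in zip(rows, labels):
            groups[row[f]].append(y)
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        if base - remainder > best_gain + 1e-12:  # strict improvement only
            best_gain, best_f = base - remainder, f
    if best_f is None:
        return set()
    used = {best_f}
    parts = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        parts[row[best_f]][0].append(row)
        parts[row[best_f]][1].append(y)
    rest = [f for f in candidates if f != best_f]
    for sub_rows, sub_labels in parts.values():
        used |= tree_features(sub_rows, sub_labels, rest, max_depth - 1)
    return used

def train_nb(rows, labels, features):
    """Count-based Naive Bayes restricted to the selected features."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, class) -> value counts
    domain = defaultdict(set)            # feature -> observed values
    for row, y in zip(rows, labels):
        for f in features:
            value_counts[(f, y)][row[f]] += 1
            domain[f].add(row[f])
    return class_counts, value_counts, domain

def predict_nb(model, row, features):
    class_counts, value_counts, domain = model
    total = sum(class_counts.values())
    scores = {}
    for c, n in class_counts.items():
        score = math.log(n / total)
        for f in features:
            # Laplace smoothing avoids log(0) for unseen feature values
            score += math.log((value_counts[(f, c)][row[f]] + 1) / (n + len(domain[f])))
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy data: feature 1 duplicates feature 0, feature 2 is noise.
rows = [("high", "high", "x"), ("high", "high", "y"),
        ("low", "low", "x"), ("low", "low", "y")]
labels = ["sick", "sick", "healthy", "healthy"]

selected = sorted(tree_features(rows, labels, [0, 1, 2]))
model = train_nb(rows, labels, selected)
print(selected, predict_nb(model, ("high", "high", "x"), selected))
```

In this sketch the tree splits on a single feature, so the duplicated and noisy features never reach the Naïve Bayes step. The abstract describes learning the tree from a small sample of the training set; here the tree sees all four rows purely for brevity.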