Learning Bayesian network classifiers (BNCs) from data is NP-hard. Among the many BNCs, the averaged one-dependence estimator (AODE) performs remarkably well against more sophisticated alternatives, and its trade-off between bias and variance can be attributed to the independence assumption and the i.i.d. assumption, which address the issues of structure complexity and data complexity, respectively. To relax these assumptions and improve AODE, we propose a double weighting scheme, combining attribute weighting and model weighting, to fine-tune the conditional probability estimates obtained by generative learning and the joint probability estimates obtained by discriminative learning, respectively. Instance weighting is introduced to define information-theoretic metrics that capture how the probability distributions vary across data points. This highly scalable learning approach can establish a decision boundary that is specifically tailored to each instance. Our extensive experimental evaluation on 34 datasets from the UCI machine learning repository shows that attribute weighting and model weighting are complementary, although each can work on its own. The proposed double-weighted AODE, called DWAODE, is a competitive alternative to other weighting approaches. The experimental results show that DWAODE demonstrates a significant advantage in terms of zero–one loss, bias–variance decomposition, root mean squared error (RMSE), and the Friedman and Nemenyi tests.
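The abstract does not spell out the DWAODE estimator, so the following is a minimal sketch of how double weighting could enter an AODE prediction, assuming attribute weights act as exponents on the conditional probabilities inside each super-parent one-dependence estimator (SPODE) and model weights form a weighted average over the SPODEs. The function name `dwaode_predict` and the parameters `w` (attribute weights) and `W` (model weights) are hypothetical illustrations, not the paper's actual formulation, and both weight vectors are assumed given rather than learned.

```python
import numpy as np

def dwaode_predict(x, X, y, n_classes, n_vals, w, W, alpha=1.0):
    """Score classes for one instance x with a double-weighted AODE (sketch).

    X, y    : discrete training data (n x d integer matrix, class labels)
    n_vals  : number of values each attribute can take (assumed uniform here)
    w[j]    : attribute weight, applied as an exponent on P(x_j | y, x_i)
    W[i]    : model weight on the SPODE whose super-parent is attribute i
    alpha   : Laplace smoothing constant
    """
    n, d = X.shape
    scores = np.zeros(n_classes)
    for c in range(n_classes):
        spode = np.zeros(d)
        for i in range(d):  # SPODE with super-parent attribute i
            mask = (y == c) & (X[:, i] == x[i])
            m = mask.sum()
            # smoothed joint P(y=c, x_i)
            p_parent = (m + alpha) / (n + alpha * n_classes * n_vals)
            s = p_parent
            rows = X[mask]
            for j in range(d):
                if j == i:
                    continue
                # smoothed conditional P(x_j | y=c, x_i), attribute-weighted
                p_cond = ((rows[:, j] == x[j]).sum() + alpha) / (m + alpha * n_vals)
                s *= p_cond ** w[j]
            spode[i] = s
        # model weighting: weighted combination of the SPODE estimates
        scores[c] = np.dot(W, spode)
    return scores / scores.sum()  # normalized posterior estimate

# Usage on synthetic discrete data; uniform w and W reduce this to plain AODE.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 4))   # 4 attributes, 3 values each
y = rng.integers(0, 2, size=200)        # binary class
post = dwaode_predict(X[0], X, y, n_classes=2, n_vals=3,
                      w=np.ones(4), W=np.ones(4) / 4)
print(post)
```

With uniform weights the estimator collapses to standard AODE, which illustrates why the two weighting components are complementary: attribute weights reshape each SPODE's conditional terms, while model weights reshape how the SPODEs are averaged.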