Abstract

Bayesian network classifiers (BNCs) have demonstrated competitive classification accuracy in a variety of real-world applications. However, BNCs are error-prone when discriminating among high-confidence labels. To address this issue, we propose the label-driven learning framework, which incorporates instance-based learning and ensemble learning. For each testing instance, high-confidence labels are first selected by a generalist classifier, e.g., the tree-augmented naive Bayes (TAN) classifier. Then, by focusing on these labels, conditional mutual information is redefined to measure the mutual dependence between attributes more precisely, leading to a refined generalist with a more reasonable network structure. To enable finer discrimination, an expert classifier is tailored to each high-confidence label. Finally, the predictions of the refined generalist and the experts are aggregated. We extend TAN to LTAN (Label-driven TAN) by applying the proposed framework. Extensive experimental results demonstrate that LTAN delivers superior classification accuracy to several state-of-the-art single-structure BNCs as well as to some established ensemble BNCs, at the cost of reasonable computational overhead.
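TAN builds its attribute tree by weighting candidate edges with conditional mutual information I(Xi; Xj | C) and extracting a maximum spanning tree. As a minimal sketch of the standard edge-weight computation (not the paper's label-driven redefinition, which restricts attention to high-confidence labels), the empirical quantity can be estimated from counts:

```python
import math
from collections import Counter

def conditional_mutual_information(xi, xj, c):
    """Empirical conditional mutual information I(Xi; Xj | C).

    Standard TAN uses this quantity as the edge weight between a pair
    of attributes when building its maximum spanning tree.
    xi, xj, c are equal-length sequences of discrete values.
    """
    n = len(c)
    n_ijc = Counter(zip(xi, xj, c))   # joint counts of (Xi, Xj, C)
    n_ic = Counter(zip(xi, c))        # joint counts of (Xi, C)
    n_jc = Counter(zip(xj, c))        # joint counts of (Xj, C)
    n_c = Counter(c)                  # class counts
    cmi = 0.0
    for (a, b, k), count in n_ijc.items():
        # p(a,b,k) * log[ p(a,b,k) p(k) / (p(a,k) p(b,k)) ],
        # written directly in counts so the 1/n factors cancel.
        cmi += (count / n) * math.log(
            (count * n_c[k]) / (n_ic[(a, k)] * n_jc[(b, k)])
        )
    return cmi
```

With perfectly correlated binary attributes the value reaches log 2 nats per class; for conditionally independent attributes it is zero.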

Highlights

  • Supervised classification is a fundamental issue in machine learning and data mining

  • The label conditional mutual information (LCMI) metric I(Xi; Xj | c), which BMCCL adopts for computing the edge weights of local Chow–Liu trees, is invariant across all testing instances

  • Our framework is intended to alleviate this problem. To assess whether it succeeds, we propose two new criteria: correction percentage (CP) and loss percentage (LP)
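The summary does not define CP and LP. A plausible reading (an assumption here, not the paper's stated definitions) is that CP measures how many baseline errors the refined framework corrects, while LP measures how many baseline successes it loses:

```python
def correction_and_loss_percentage(y_true, y_base, y_refined):
    """Hypothetical CP/LP computation; the paper's exact definitions
    may differ. CP: share of the baseline's errors that the refined
    model fixes. LP: share of the baseline's correct predictions
    that the refined model loses."""
    corrected = lost = base_wrong = base_right = 0
    for t, b, r in zip(y_true, y_base, y_refined):
        if b != t:                 # baseline misclassified
            base_wrong += 1
            if r == t:
                corrected += 1     # framework corrected the error
        else:                      # baseline was correct
            base_right += 1
            if r != t:
                lost += 1          # framework introduced an error
    cp = corrected / base_wrong if base_wrong else 0.0
    lp = lost / base_right if base_right else 0.0
    return cp, lp
```

Under this reading, a useful refinement stage should show CP well above LP.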


Summary

Introduction

Supervised classification is a fundamental issue in machine learning and data mining. To explore possible attribute dependencies that exist in testing data, instance-based BNCs [4] learn the most appropriate network structure for each testing instance at classification time. In the label filtering stage, by exploiting credible information derived from high-confidence labels, a refined generalist with a more accurate network structure is deduced for each reconsidered testing instance. In the label specialization stage, a Bayesian multinet classifier (BMC) is built to model label-specific causal relationships in the context of a particular reconsidered testing instance. Rather than relying on the decision of a single classifier, the framework averages the predictions of the refined generalist and the experts, thereby inheriting the merits of ensemble learning.
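The final averaging step can be sketched as follows. This is a minimal uniform-average illustration; whether the paper weights the refined generalist and the experts differently is not stated in this summary:

```python
def aggregate_predictions(prob_dists):
    """Uniformly average the class-posterior estimates produced by the
    ensemble members (refined generalist plus expert classifiers) and
    predict the class with the highest averaged posterior.

    prob_dists: list of per-member posterior lists, one entry per class.
    Returns (predicted_class_index, averaged_posteriors).
    """
    n_members = len(prob_dists)
    n_classes = len(prob_dists[0])
    avg = [sum(d[c] for d in prob_dists) / n_members
           for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg
```

For example, three members voting [0.6, 0.4], [0.2, 0.8], and [0.3, 0.7] average to about [0.37, 0.63], so class 1 is predicted even though one member preferred class 0.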

Information Theory
Bayesian Network Classifiers
Naive Bayes
TAN and WATAN
Bayesian Multinet Classifiers
Motivation
Label Filtering Stage
Label Specialization Stage
Overall Structure and Complexity Analysis
Empirical Study
Selection of the Threshold for Label Filtering
Comparisons in Terms of Zero-One Loss
Analysis of the Label-Driven Learning Framework
Effects of the Label Filtering Stage
Effects of the Label Specialization Stage
Findings
Discussion
Conclusions