Abstract

It is common practice to handle labeled data with classifiers and unlabeled data with clustering. Traditional Bayesian network classifiers (BNC$^{\mathcal{T}}$s) learned from a labeled training set $\mathcal{T}$ map each unlabeled test instance directly into the learned network structure to calculate the conditional probabilities used for classification. This neglects the information hidden in the unlabeled data and can result in classification bias. To address this issue, we propose a novel learning framework, called model matching, that uses a “clustering” strategy to solve the classification problem. The labeled data are divided into clusters according to class label to learn a set of BNC$^{\mathcal{T}}$s, and a corresponding set of BNC$^{p}$s is built for each unlabeled test instance. To classify an instance, the cross entropy method is applied to compare the structural similarity between BNC$^{\mathcal{T}}$ and BNC$^{p}$. Extensive experimental results on 46 datasets from the University of California at Irvine (UCI) machine learning repository demonstrate that, for BNCs, model matching helps improve generalization performance and outperforms several state-of-the-art classifiers, such as tree-augmented naive Bayes and random forest.
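The matching step can be illustrated with a minimal sketch. It is not the method of the paper, which compares the structures of Bayesian network classifiers. Here each class model and the instance model are reduced to a single discrete distribution, and the class whose distribution gives the lowest cross entropy $H(p, q) = -\sum_x p(x) \log q(x)$ is selected. The function names `cross_entropy` and `match_class` and the toy distributions are hypothetical, chosen only for illustration.

```python
import math

def cross_entropy(p, q, eps=1e-12):
    """H(p, q) = -sum_x p(x) * log q(x) for two discrete distributions.

    eps guards against log(0); it is an assumption of this sketch.
    """
    return -sum(px * math.log(max(qx, eps)) for px, qx in zip(p, q) if px > 0)

def match_class(instance_dist, class_dists):
    """Return the label whose distribution has the lowest cross entropy
    with the instance distribution (a stand-in for model matching)."""
    return min(class_dists,
               key=lambda label: cross_entropy(instance_dist, class_dists[label]))

# Hypothetical toy distributions over three attribute values.
class_dists = {"c1": [0.7, 0.2, 0.1], "c2": [0.1, 0.3, 0.6]}
instance_dist = [0.6, 0.3, 0.1]
print(match_class(instance_dist, class_dists))
```

In this toy setting the instance distribution is closer to `c1`, so that label is returned. A faithful implementation would replace the flat distributions with the per-class and per-instance network models described above.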
