Abstract

Of the numerous proposals to improve naive Bayes (NB) by weakening its attribute independence assumption, SuperParent (SP) has demonstrated remarkable classification performance. In many real-world applications, however, accurate class probability estimation is more desirable than classification alone. For example, we often need to recommend commodities to customers with a higher likelihood (class probability) of purchase. Conditional log likelihood (CLL) is a well-accepted measure of the quality of class probability estimation. Motivated by this, we first investigate the class probability estimation performance of SP in terms of CLL and find that it is nearly tied with that of the original distribution-based tree augmented naive Bayes (TAN). To improve this performance, we then propose a CLL-based SuperParent algorithm (CLL-SP), in which a CLL-based approach, rather than a classification-based one, is used to find the augmenting arcs. Experimental results on a large suite of benchmark datasets show that CLL-SP significantly outperforms both the classification-based approach (SP) and the original distribution-based approach (TAN) in terms of CLL, while maintaining the high classification accuracy that characterizes SP.
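
To make the distinction between classification accuracy and CLL concrete, the following is a minimal sketch, assuming the common definition of CLL as the sum over test instances of the log of the probability assigned to the true class. The function name `conditional_log_likelihood`, the toy probabilities, the use of the natural logarithm, and the `eps` clamp are all illustrative assumptions and are not taken from the paper.

```python
import math


def conditional_log_likelihood(probs, labels, eps=1e-12):
    """Sum of log P(true class | instance) over all instances.

    probs  : list of dicts mapping class label -> estimated probability
    labels : list of true class labels, aligned with probs
    eps    : assumed lower clamp to avoid log(0); not part of the definition
    """
    total = 0.0
    for dist, true_label in zip(probs, labels):
        p = max(dist[true_label], eps)
        total += math.log(p)
    return total


def accuracy(probs, labels):
    """Fraction of instances whose most probable class is the true class."""
    hits = sum(1 for dist, y in zip(probs, labels) if max(dist, key=dist.get) == y)
    return hits / len(labels)


# Two hypothetical classifiers scored on the same two instances.
labels = ["buy", "buy"]
sharp = [{"buy": 0.9, "skip": 0.1}, {"buy": 0.8, "skip": 0.2}]
vague = [{"buy": 0.55, "skip": 0.45}, {"buy": 0.51, "skip": 0.49}]

# Both classify every instance correctly, but their CLL differs.
print(accuracy(sharp, labels), conditional_log_likelihood(sharp, labels))
print(accuracy(vague, labels), conditional_log_likelihood(vague, labels))
```

In this toy setting both classifiers have the same accuracy, yet the one that assigns more probability to the true class has the higher (less negative) CLL. That gap is what an accuracy-based search for augmenting arcs cannot see and a CLL-based search can.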
