Abstract

The naïve Bayes classifier is one of the most commonly used data mining methods for classification. Despite its simplicity, naïve Bayes is effective and computationally efficient. Although the strong attribute independence assumption in the naïve Bayes classifier makes it a tractable method for learning, this assumption may not hold in real-world applications. Many enhancements to the basic algorithm have been proposed to alleviate the violation of the attribute independence assumption. While these methods improve classification performance, they do not necessarily retain the mathematical structure of the naïve Bayes model, and some do so at the expense of computational time. One approach to reducing the naïveté of the classifier is to incorporate attribute weights into the conditional probabilities. In this paper, we propose a method to incorporate attribute weights into naïve Bayes. To evaluate its performance, we used public benchmark datasets and compared our method with standard naïve Bayes and baseline attribute weighting methods. Experimental results show that our method improves classification accuracy and F1 over both standard naïve Bayes and the baseline attribute weighting methods, especially when the independence assumption is strongly violated, which we validated using the chi-square test of independence.
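For reference, the standard naïve Bayes decision rule and the exponent form of attribute weighting that is common in this line of work can be written as follows (a sketch of one possible formulation; the excerpt does not specify the paper's exact weighting scheme):

```latex
% Standard naive Bayes: attributes x_1, ..., x_d are assumed
% conditionally independent given the class c.
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{d} P(x_i \mid c)

% Attribute-weighted variant (a common form in the weighting
% literature, not necessarily the paper's): w_i scales the
% influence of attribute i on the posterior.
\hat{c}_{w} = \arg\max_{c} \; P(c) \prod_{i=1}^{d} P(x_i \mid c)^{w_i}
```

With all w_i = 1 the weighted rule reduces to standard naïve Bayes, and w_i = 0 effectively removes attribute i, so attribute selection can be viewed as a special case of attribute weighting.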

Highlights

  • The naïve Bayes classifier is one of the most widely used algorithms for data mining applications

  • When we used information gain (IG) as the attribute weights, we obtained accuracy gains over the standard naïve Bayes similar to those obtained with Kullback-Leibler (KL) divergence weights

  • This paper proposes a new method to incorporate attribute weights into the computation of conditional probabilities for the naïve Bayes classifier (a sketch of one possible weighting scheme follows this list)
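A minimal sketch of the general idea behind attribute-weighted naïve Bayes, assuming categorical attributes, information gain as the weights, and exponent weighting of the conditional probabilities (all of these are illustrative choices; the paper's exact scheme may differ):

```python
# A sketch of attribute-weighted naive Bayes for categorical data.
# Assumptions (not from the paper): information gain supplies the weight
# w_j of attribute j, and weights enter as exponents, i.e. the score of
# class c is P(c) * prod_j P(x_j | c) ** w_j.
import numpy as np
from sklearn.metrics import mutual_info_score  # IG for discrete attributes

class WeightedNaiveBayes:
    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        self.classes_ = np.unique(y)
        n, d = X.shape
        # Information gain of each attribute with respect to the class.
        self.weights_ = np.array(
            [mutual_info_score(y, X[:, j]) for j in range(d)]
        )
        # Class priors and Laplace-smoothed conditional probability tables.
        self.priors_ = {c: np.mean(y == c) for c in self.classes_}
        self.cpt_ = []  # per attribute: (class, value) -> P(value | class)
        for j in range(d):
            values = np.unique(X[:, j])
            table = {}
            for c in self.classes_:
                col = X[y == c, j]
                for v in values:
                    table[(c, v)] = (np.sum(col == v) + 1) / (len(col) + len(values))
            self.cpt_.append(table)
        return self

    def predict(self, X):
        preds = []
        for x in np.asarray(X):
            # Weighted log-posterior: log P(c) + sum_j w_j * log P(x_j | c).
            scores = {
                c: np.log(self.priors_[c]) + sum(
                    w * np.log(t.get((c, v), 1e-9))  # tiny floor for unseen values
                    for w, t, v in zip(self.weights_, self.cpt_, x)
                )
                for c in self.classes_
            }
            preds.append(max(scores, key=scores.get))
        return np.array(preds)
```

Fixing every weight to 1 recovers the standard naïve Bayes classifier, which makes this formulation convenient for the kind of head-to-head comparison described in the abstract.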


Introduction

The naïve Bayes classifier is one of the most widely used algorithms for data mining applications. The naïveté in the classifier is that all attributes are assumed to be independent given the class. This assumption simplifies the computation needed to infer the probability of a class given the data. Although the attribute independence assumption makes naïve Bayes a tractable method for learning, it may not hold in real-world applications. One approach to addressing this is to combine naïve Bayes with a pre-processing step: attribute selection is first performed to identify the set of informative attributes before training the naïve Bayes classifier, as sketched below.
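As an illustration of that pre-processing approach (a generic scikit-learn sketch, not the paper's method; the dataset, scoring function, and number of selected attributes are illustrative choices):

```python
# Attribute selection as a pre-processing step before naive Bayes:
# score attributes with mutual information, keep the top k, then
# train a standard naive Bayes model on the reduced attribute set.
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = make_pipeline(SelectKBest(mutual_info_classif, k=2), GaussianNB())
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```

Selection is all-or-nothing: an attribute is either kept or discarded. Attribute weighting, as pursued in this paper, generalizes this by letting each attribute contribute in proportion to its informativeness.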

