Abstract

Each document in a multi-label classification task is associated with a subset of labels. These documents usually contain a large number of features, which can hamper the performance of learning algorithms. Feature selection is therefore helpful in removing the redundant and irrelevant features that hold performance back. The current study proposes a Naive Bayes (NB) multi-label classification algorithm that incorporates a wrapper-based feature selection strategy aimed at determining the best minimum confidence threshold. This paper also suggests transforming the multi-label documents before applying a standard feature selection algorithm: each document is copied once for every label it belongs to, with each copy retaining all of the document's features. Seven minimum confidence thresholds were then evaluated, with Class Association Rules (CARs) serving as the wrapper approach. Experiments on benchmark datasets showed that the Naive Bayes Multi-label (NBML) classifier achieved an average precision of 87.9% on the business dataset when using a minimum confidence threshold of 0.1%.
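The copy transformation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `copy_transform` and the tuple-based document representation are assumptions made for clarity. Each multi-label document is duplicated once per assigned label, and every copy keeps the full feature set, so a standard single-label feature selection algorithm can then be applied.

```python
def copy_transform(documents):
    """Copy each multi-label document into one single-label copy per label.

    `documents` is a list of (features, labels) pairs, where `features`
    is any feature representation (e.g. a term-frequency dict) and
    `labels` is the set of labels assigned to the document.
    """
    single_label = []
    for features, labels in documents:
        for label in labels:
            # Every copy retains all features of the original document.
            single_label.append((features, label))
    return single_label


# Example: a document with two labels yields two single-label copies.
docs = [({"word_a": 2, "word_b": 1}, {"business", "sports"})]
print(copy_transform(docs))
```

Under this sketch, a corpus of n documents with an average of k labels each expands to roughly n*k single-label instances before feature selection is run.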
