Abstract

Most organizations, in the course of their activities, produce a large volume of electronic documents as an essential part of their internal and external operations. When documents are organized into a large number of topic categories, the categories are frequently arranged in a hierarchy. Newsgroup and Yahoo databases are two examples. This article shows that the accuracy of a naïve Bayes text classifier can be significantly improved by taking advantage of a hierarchy of categories. A statistical technique known as shrinkage is adopted, which smooths the parameter estimates of a data-sparse child category with those of its parent in order to obtain more robust parameter estimates. Experimental results on three real-world datasets from Yahoo, UseNet, and shared web pages show improved performance, with about 29% error reduction over the conventional flat classifier.
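
As an illustration of the shrinkage idea described above, the following minimal Python sketch mixes the word-probability estimate of a data-sparse leaf category with the estimates of its ancestors and a uniform distribution. It is a sketch under stated assumptions, not the authors' implementation: the function names, the toy counts, and the fixed mixture weights are all hypothetical, and the abstract does not say how the weights are chosen.

```python
from collections import Counter


def ml_estimate(counts, vocab):
    """Maximum-likelihood word distribution from raw counts (zeros allowed)."""
    total = sum(counts[w] for w in vocab)
    return {w: (counts[w] / total if total else 0.0) for w in vocab}


def shrinkage_estimate(path_counts, vocab, weights):
    """Mix estimates along a leaf-to-root path with a uniform distribution.

    path_counts: list of Counters, leaf category first, root last.
    weights: one mixture weight per path node plus a final weight for the
             uniform distribution; assumed fixed here and required to sum to 1.
    """
    if len(weights) != len(path_counts) + 1:
        raise ValueError("need one weight per path node plus one for uniform")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    estimates = [ml_estimate(c, vocab) for c in path_counts]
    uniform = 1.0 / len(vocab)
    return {
        w: sum(lam * est[w] for lam, est in zip(weights, estimates))
        + weights[-1] * uniform
        for w in vocab
    }


# Toy example (hypothetical data): a sparse leaf category and its parent.
vocab = ["ball", "goal", "vote"]
leaf = Counter({"ball": 2})                           # very little data
parent = Counter({"ball": 5, "goal": 4, "vote": 1})   # pooled over children
shrunk = shrinkage_estimate([leaf, parent], vocab, weights=[0.5, 0.4, 0.1])
```

In this toy case the leaf alone would assign zero probability to "goal", while the shrunk estimate borrows mass from the parent, so an unseen but related word no longer vetoes the category in the naïve Bayes product.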
