Abstract

The exponential growth of text content shared on the Internet calls for analysis that converts it into useful information. Research areas such as Web mining, opinion mining, and text mining focus on tasks such as content mining, statistical analysis, prediction, and classification. Multinomial Naïve Bayes (MNB), a state-of-the-art Bayesian classifier, is among the fastest and simplest text classifiers. The objective of the proposed study is to enhance classification by replacing the conditional probability of the existing MNB with a probability-based frequency computation. A new combination of Pointwise Mutual Information (PMI) and different normalized Term Frequency (TF) measures is used to compute the conditional probability. These combinations weight words according to the information gain they carry about the documents that belong to a class. The robustness of the Similarity-based Enhanced Conditional Probability MNB (SECP-MNB) is reflected in its measured classification accuracy.
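
The abstract does not state the exact formula, so the following is only a minimal sketch of one possible reading of the idea, not the authors' SECP-MNB. It assumes a length-normalized TF, a document-level PMI between word and class, and a weight of the form TF × (1 + max(PMI, 0)) that replaces the raw count in the usual smoothed MNB conditional probability. The function names `train` and `predict`, the parameter `alpha`, and the weighting form are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(docs, labels, alpha=1.0):
    """docs: list of token lists; labels: list of class names.

    Assumed weighting (not taken from the paper):
    weight(w, c) = normalized_tf(w, c) * (1 + max(PMI(w, c), 0)).
    """
    vocab = sorted({w for d in docs for w in d})
    classes = sorted(set(labels))
    n_docs = len(docs)
    class_docs = Counter(labels)      # number of documents per class
    df = Counter()                    # documents containing w
    df_c = defaultdict(Counter)       # documents of class c containing w
    tf_c = defaultdict(Counter)       # length-normalized TF summed per class

    for tokens, y in zip(docs, labels):
        for w, n in Counter(tokens).items():
            df[w] += 1
            df_c[y][w] += 1
            tf_c[y][w] += n / len(tokens)

    model = {"classes": classes, "log_prior": {}, "log_cond": {}}
    for c in classes:
        model["log_prior"][c] = math.log(class_docs[c] / n_docs)
        weights = {}
        for w in vocab:
            if df_c[c][w]:
                # PMI(w, c) = log( P(w, c) / (P(w) * P(c)) ), document-level
                pmi = math.log(df_c[c][w] * n_docs / (df[w] * class_docs[c]))
            else:
                pmi = 0.0
            weights[w] = tf_c[c][w] * (1.0 + max(pmi, 0.0))
        # Laplace-style smoothing keeps every conditional probability non-zero
        total = sum(weights.values()) + alpha * len(vocab)
        model["log_cond"][c] = {
            w: math.log((weights[w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, tokens):
    """Return the class with the highest log posterior; unseen words are skipped."""
    scores = {}
    for c in model["classes"]:
        cond = model["log_cond"][c]
        scores[c] = model["log_prior"][c] + sum(cond[w] for w in tokens if w in cond)
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration
docs = [["good", "great", "film"], ["great", "fun"],
        ["bad", "boring", "film"], ["bad", "awful"]]
labels = ["pos", "pos", "neg", "neg"]
model = train(docs, labels)
print(predict(model, ["great", "film"]))
```

In this sketch a word that appears in both classes equally often (here `film`) gets a PMI of zero and so contributes only its normalized TF, while class-specific words are boosted. Other TF normalizations or PMI variants could be swapped in at the `weights[w]` line.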
