Abstract

Text classification is a key technique for assigning text to classes such as positive, negative, and neutral, and it is a fast-growing discipline in computer science. This paper illustrates Odia text classification using the Naïve Bayes algorithm. Naïve Bayes is a supervised classification algorithm suitable for both binary and multiclass problems; it classifies unseen instances by assigning class labels using conditional probability. We propose an auxiliary feature method for Odia text. The method first determines features with an existing Naïve Bayes feature selection method, then selects an auxiliary feature that can classify the text given the selected features, and uses the corresponding conditional probability to improve classification accuracy. Illustrative examples show that the proposed method improves the performance of the Naïve Bayes classifier. About one thousand sentences are used for training and testing in the empirical evaluation. Accuracy depends mainly on the size of the training corpus, which was built by the authors, and may improve to some extent with parameter choices and a larger corpus. The results show that the Naïve Bayes technique significantly outperforms many other techniques such as HMM (Hidden Markov Model), CRF (Conditional Random Field), and KNN (K-Nearest Neighbours). Text classification plays an important role in sentiment analysis, information extraction, text summarization, text retrieval, and question answering.
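To make the conditional-probability step concrete, the following is a minimal sketch of a standard multinomial Naïve Bayes classifier with Laplace smoothing. It is not the auxiliary feature method proposed in the paper, whose details are not given in this abstract. The function names `train` and `predict`, the smoothing parameter `alpha`, and the toy English sentences in `TOY` are all illustrative assumptions; they stand in for tokenized Odia sentences and are not taken from the authors' corpus.

```python
import math
from collections import Counter, defaultdict

def train(samples, alpha=1.0):
    """Count class and word frequencies from (tokens, label) pairs.

    `alpha` is the Laplace smoothing constant (an assumed default).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}

def predict(model, tokens):
    """Return the label maximizing log P(label) + sum of log P(token | label)."""
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokens:
            if tok not in model["vocab"]:
                continue  # skip words never seen in training
            # Smoothed conditional probability P(token | label)
            score += math.log((counts[tok] + alpha) /
                              (total_words + alpha * vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data: English stand-ins for tokenized Odia sentences.
TOY = [
    ("good movie".split(), "positive"),
    ("very good story".split(), "positive"),
    ("bad movie".split(), "negative"),
    ("very bad acting".split(), "negative"),
    ("movie released today".split(), "neutral"),
    ("story told today".split(), "neutral"),
]

model = train(TOY)
print(predict(model, "good story".split()))
```

Scores are accumulated in log space so that products of many small probabilities do not underflow, and smoothing keeps a word that is absent from one class from driving that class's probability to zero.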
