Abstract

Text classification is a widely used task in natural language processing. State-of-the-art text classifiers use the vector space model to extract features. Recent progress in deep learning has shown that recurrent neural networks, which preserve the positional relationships among words, achieve higher accuracy. To push text classification accuracy even higher, multi-dimensional document representations, such as vector sequences or matrices combined with document sentiment, should be explored. In this paper, we show that documents can be represented as a sequence of vectors carrying semantic meaning and classified using a recurrent neural network that recognizes long-range relationships. We show that in this representation, additional sentiment vectors can be easily attached as a fully connected layer to the word vectors to further improve classification accuracy. On the UCI sentiment labelled dataset, using the sequence of vectors alone achieved an accuracy of 85.6%, better than the 80.7% from the ridge regression classifier, the best among the classical techniques we tested. Adding sentiment information further increases accuracy to 86.3%. On our suicide notes dataset, the best classical technique, the Naïve Bayes Bernoulli classifier, achieves an accuracy of 71.3%, while our classifier, incorporating both semantic and sentiment information, exceeds that with an accuracy of 75%.
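As a concrete illustration of this architecture, the following is a minimal sketch, not the authors' exact model, written with TensorFlow/Keras; the vocabulary size, sequence length, layer widths and sentiment-vector size are all illustrative assumptions. It runs an LSTM over a sequence of word embeddings (where pretrained GloVe weights could be loaded) and attaches a document-level sentiment vector through a fully connected layer before the final classification.

    # A minimal sketch, not the authors' exact model; all sizes are assumptions.
    from tensorflow.keras import layers, Model

    VOCAB_SIZE = 10_000   # assumed vocabulary size
    SEQ_LEN = 100         # assumed maximum document length in tokens
    EMB_DIM = 100         # e.g. 100-dimensional GloVe vectors
    SENT_DIM = 2          # assumed size of the per-document sentiment vector

    # Sequence-of-vectors branch: token ids -> word embeddings -> LSTM.
    tokens = layers.Input(shape=(SEQ_LEN,), dtype="int32", name="tokens")
    emb = layers.Embedding(VOCAB_SIZE, EMB_DIM)(tokens)  # GloVe weights could be loaded here
    hidden = layers.LSTM(64)(emb)  # final hidden state summarizes the document

    # Sentiment branch: attach the sentiment vector through a fully connected layer.
    sentiment = layers.Input(shape=(SENT_DIM,), name="sentiment")
    sent_features = layers.Dense(8, activation="relu")(sentiment)

    merged = layers.Concatenate()([hidden, sent_features])
    output = layers.Dense(1, activation="sigmoid")(merged)  # binary class probability

    model = Model(inputs=[tokens, sentiment], outputs=output)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

Training would supply pairs of token-id sequences and per-document sentiment vectors; concatenating the two branches mirrors the paper's idea of attaching sentiment information through a fully connected layer.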

Highlights

  • Text classification is the task of organizing text documents into pre-defined categories [1]

  • Using one-hot vectors matched that accuracy at 80.6%, showing that long short-term memory (LSTM) networks managed to find patterns in the very sparse sequence of vectors (see the one-hot sketch after these highlights)

  • Using an LSTM with GloVe vectors resulted in better accuracy, at 85.3%, than the best traditional classifier
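The following is a minimal sketch of the one-hot representation mentioned above, using an assumed toy vocabulary: each token becomes a vector that is zero everywhere except at the token's index, so a document becomes a very sparse (sequence length x vocabulary size) matrix that an LSTM can consume.

    # A minimal sketch of one-hot document encoding, with an assumed toy vocabulary.
    import numpy as np

    vocab = {"the": 0, "movie": 1, "was": 2, "great": 3, "awful": 4}

    def one_hot_sequence(doc, vocab):
        # Each row is a one-hot vector: 1 at the token's index, 0 elsewhere.
        ids = [vocab[tok] for tok in doc.lower().split() if tok in vocab]
        seq = np.zeros((len(ids), len(vocab)), dtype=np.float32)
        seq[np.arange(len(ids)), ids] = 1.0
        return seq

    print(one_hot_sequence("the movie was great", vocab))  # shape (4, 5), mostly zeros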

Introduction

Text classification is the task of organizing text documents into pre-defined categories [1]. It is an important aspect of data processing that makes data usable by humans, and it is used in spam filtering [2], language identification [3], sentiment analysis [4] and many other areas. Naïve Bayes classifiers use the Bayes rule to compute the probability P(Ck | w1, …, wn) that a document containing the words w1, …, wn belongs to class Ck. They assume that the occurrence of each word is independent of the others, so that P(Ck | w1, …, wn) ∝ P(Ck) · P(w1 | Ck) · … · P(wn | Ck). Stemming may use heuristics, as in the Porter Stemmer [23], or morphology, as in the Morphy class in the Stanford Natural Language Processing library [24].
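As an illustration of this classical baseline, here is a minimal sketch assuming scikit-learn and NLTK (not necessarily the paper's exact configuration): a Bernoulli Naïve Bayes classifier over binary, Porter-stemmed word features, trained on a tiny illustrative corpus.

    # A minimal sketch of the classical baseline, assuming scikit-learn and NLTK;
    # the corpus and pipeline details are illustrative, not the paper's setup.
    from nltk.stem import PorterStemmer
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.pipeline import make_pipeline

    stemmer = PorterStemmer()

    def stem_tokens(doc):
        # Heuristic stemming: reduce each word to its stem, e.g. "running" -> "run".
        return [stemmer.stem(tok) for tok in doc.split()]

    # Binary word-presence features match BernoulliNB's event model, which multiplies
    # the independent per-word probabilities P(wi | Ck) under the Bayes rule.
    clf = make_pipeline(
        CountVectorizer(tokenizer=stem_tokens, binary=True),
        BernoulliNB(),
    )

    docs = ["I loved this movie", "terrible waste of time",
            "great acting and story", "boring and awful"]
    labels = [1, 0, 1, 0]
    clf.fit(docs, labels)
    print(clf.predict(["what a great film", "awful boring movie"]))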
