Abstract

Sentiment classification has become one of the most popular text classification domains in recent years. As with all text classification problems, the high dimensionality of the feature space is a major concern for sentiment classification because it affects accuracy. This study analyses the performance of six recent text feature selection methods for document-level sentiment classification using two widely known classifiers, namely Support Vector Machines (SVM) and naive Bayes (NB). Three datasets containing different types of sentiment data were used in the experiments: Cornell movie review, Sentiment140, and Nine public sentiment. Two success measures, Micro-F1 and Macro-F1, were used for evaluation, and 3-fold cross-validation was applied for a fair assessment of system performance. The experiments indicated that the distinguishing feature selector (DFS) and discriminative features selection (DFSS) methods are superior to the other four feature selection methods for sentiment classification. In general, the SVM classifier achieved its highest classification performance when combined with the DFSS feature selection method, whereas the NB classifier achieved its highest performance when combined with the DFS feature selection method.
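The two success measures named above differ in how they average over classes, which matters when the classes are imbalanced. The following is a minimal sketch of that difference under the standard definitions of Micro-F1 and Macro-F1 for single-label classification. The function name `f1_scores` and the toy labels are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label predictions.

    Hypothetical helper for illustration only; not the study's code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0, 0.0

    # Per-class true positives, false positives, and false negatives.
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro-F1 pools the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro-F1 computes F1 per class, then takes the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Toy example (made-up labels, not data from the study).
y_true = ["pos", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "neg", "neg"]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
print(micro_f1, macro_f1)
```

Because Macro-F1 weights every class equally, a weak result on a small class lowers it more than it lowers Micro-F1, which is one reason to report both measures.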
