Cognitive computing is an interdisciplinary research field that simulates human thought processes in computerized models. One application of cognitive computing is sentiment analysis of online reviews, which reflect consumers' opinions and attitudes toward the products and services they have experienced. High classification performance facilitates decision making for both consumers and firms. However, while much effort has gone into proposing advanced classification algorithms to improve performance, the textual quality of the data has been largely overlooked. This research explores the impact of two influential textual features, word count and review readability, on the performance of sentiment classification. We apply three representative deep learning techniques, namely a simple recurrent network (SRN), long short-term memory (LSTM), and a convolutional neural network (CNN), to sentiment analysis tasks on a benchmark movie review dataset. Multiple regression models are then employed for statistical analysis. Our findings show that a dataset of short, highly readable reviews achieves the best performance among all combinations of word count and readability levels, and that controlling review length is more effective for raising accuracy than increasing readability. Based on these findings, a practical application, such as a text evaluator or a website plug-in for text evaluation, can be developed to provide review editing and quality control services for crowd-sourced review websites. These findings can help generate more valuable reviews of high textual quality that better serve sentiment analysis and decision making.
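As an illustration of the two textual features, the following minimal Python sketch computes a word count and a readability score for a review and assigns it to a length/readability group. The abstract does not name the readability metric or the cut-off values, so the Flesch Reading Ease formula, the rough vowel-group syllable estimate, and the thresholds `max_words` and `min_readability` are all assumptions made here for illustration, not the authors' method.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels (assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_features(review: str) -> dict:
    """Return the word count and a Flesch Reading Ease score for one review."""
    words = re.findall(r"[A-Za-z']+", review)
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    n_words = len(words)
    if n_words == 0:
        # No words: readability is undefined.
        return {"word_count": 0, "readability": None}
    n_sentences = max(1, len(sentences))
    n_syllables = sum(count_syllables(w) for w in words)
    # Flesch Reading Ease: higher scores mean easier text.
    score = (206.835
             - 1.015 * (n_words / n_sentences)
             - 84.6 * (n_syllables / n_words))
    return {"word_count": n_words, "readability": score}


def bucket(features: dict, max_words: int = 100, min_readability: float = 60.0):
    """Assign a (length, readability) group; both thresholds are hypothetical."""
    if features["readability"] is None:
        return None
    length = "short" if features["word_count"] <= max_words else "long"
    level = "high" if features["readability"] >= min_readability else "low"
    return (length, level)


features = text_features("The cat sat. The dog ran.")
group = bucket(features)
```

Grouping a dataset this way would yield the four length/readability combinations that the study compares; the actual thresholds would need to be chosen from the data.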