Abstract

Document-level sentiment analysis is challenging because long texts contain many words and opinions, sometimes contradictory, within the same document. It is particularly useful for analyzing press articles and blog posts about a particular product or company, and it demands close attention, especially when the topic under discussion is sensitive. Nevertheless, most existing models and techniques are designed to process short texts from social networks and collaborative platforms. In this paper, we propose a combination of Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) models, with Doc2vec embedding, suited to opinion analysis in long texts. The CNN-BiLSTM model is compared with CNN, LSTM, BiLSTM and CNN-LSTM models using Word2vec/Doc2vec embeddings. Applied to French newspaper articles, the Doc2vec with CNN-BiLSTM model outperformed the other models, reaching 90.66% accuracy.

Highlights

  • Opinion or sentiment analysis [1] is a set of linguistic operations belonging to the automatic processing of natural language that apply to digital texts, namely publications and comments from social networks, as well as press articles

  • The Convolutional Neural Networks (CNN)-Bidirectional Long Short-Term Memory (BiLSTM) model achieved an accuracy of 90.66% (Figure 3)

  • The comparison of the performance of different deep learning models confirmed the value of the CNN-BiLSTM with Doc2vec, a pre-trained sentence/paragraph representation model (Table 4, the highest value is highlighted in red)


Summary

Introduction

Opinion or sentiment analysis [1] is a set of linguistic operations, belonging to natural language processing, that apply to digital texts, namely posts and comments from social networks, as well as press articles. Its objective is to identify the sentiment expressed in a text and to predict its polarity (positive or negative) towards a given subject [2]. This analysis is very useful, especially since the emergence of social networks: people now express their views very quickly, which makes manual processing of this huge number of opinions very difficult. Opinion analysis can be applied at different levels of granularity. Word-level analysis determines the polarity of a word, i.e., whether it is positive, negative or neutral. It is a more difficult level than the others because, as the number of words increases, so does the number of noise words, which distorts learning and complicates the prediction of polarity.
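The effect of noise words on polarity can be illustrated with a toy example. The sketch below is not the method proposed in the paper: it assumes a tiny hand-made lexicon (`LEXICON`, hypothetical values) and a naive averaging score (`document_score`, a hypothetical helper), only to show how a word-level signal is diluted or cancelled as a text grows longer.

```python
# Toy polarity lexicon. The entries and scores are assumptions made for
# illustration only; they do not come from the paper or from any real resource.
LEXICON = {
    "excellent": 1, "good": 1, "reliable": 1,
    "poor": -1, "bad": -1, "disappointing": -1,
}


def word_polarity(word):
    """Word-level analysis: label one word as positive, negative or neutral."""
    score = LEXICON.get(word.lower(), 0)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def document_score(text):
    """Naive document-level score: mean word score over all tokens.

    Words missing from the lexicon (noise words) count as 0, so they
    pull the average towards neutral as the text gets longer.
    """
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    return sum(LEXICON.get(t, 0) for t in tokens) / len(tokens)


short_text = "Excellent product"
diluted_text = ("The company, according to several articles published "
                "last week, delivered a good quarter")
mixed_text = ("The product is excellent, but the report on the company "
              "published last week describes poor service.")

for text in (short_text, diluted_text, mixed_text):
    print(round(document_score(text), 3), "<-", text)
```

In this sketch, the single positive word in the longer sentence yields a much weaker score than in the two-word text, and the two contradictory opinions in the last sentence cancel out entirely. A learned model that reads words in context is meant to cope with exactly this kind of dilution and contradiction.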

Results
Discussion
Conclusion
