Abstract

This paper introduces a novel implementation of an automatic labeling technique for annotating health-related tweets in three languages: English, French, and Arabic. The resulting labels support sentiment analysis. The technique relies on data preprocessing and annotates tweets automatically using domain knowledge, Natural Language Processing (NLP), and sentiment lexicons. For sentiment prediction, we use deep learning techniques; in particular, we implement a Convolutional Neural Network (CNN), a Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM) network. To train the models, we use both a private domain-specific dataset and public non-domain-specific data: large user reviews from Amazon, IMDB, and Yelp, and the Arabic Sentiment Tweets Dataset (ASTD). Our overall performance evaluation shows that LSTM-RNN outperforms results reported in the literature on both the English and Arabic datasets. It achieves an accuracy of 0.98, an F1-score of 0.97, a precision of 0.98, and a recall of 0.97 on the English Twitter dataset; an accuracy of 0.92, an F1-score of 0.91, a precision of 0.89, and a recall of 0.93 on the French Twitter dataset; and an accuracy of 0.83, an F1-score of 0.82, a precision of 0.87, and a recall of 0.79 on the Arabic Twitter dataset.
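
As an illustration of the general idea behind lexicon-based automatic labeling, the following minimal Python sketch preprocesses a tweet and assigns a label from the summed polarity of its tokens. It is not the implementation evaluated in this paper: the toy lexicon `TOY_LEXICON`, the helper names `preprocess` and `auto_label`, the English-only tokenizer, and the three-way label set are all assumptions made for the example.

```python
import re

# Hypothetical toy lexicon for illustration only; real sentiment
# dictionaries are far larger and language-specific.
TOY_LEXICON = {
    "good": 1, "better": 1, "relief": 1, "healthy": 1,
    "bad": -1, "pain": -1, "sick": -1, "worse": -1,
}


def preprocess(tweet):
    """Lowercase the tweet, drop URLs and @mentions, and return word tokens."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    # Keep alphabetic tokens; a hashtag such as "#healthy" yields "healthy".
    return re.findall(r"[a-z']+", text)


def auto_label(tweet, lexicon=TOY_LEXICON):
    """Label a tweet by the sign of its summed token polarities."""
    score = sum(lexicon.get(token, 0) for token in preprocess(tweet))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for example in (
        "Feeling better after the new treatment, such relief! https://t.co/x",
        "@doc my back pain is getting worse",
        "Clinic opens at nine",
    ):
        print(auto_label(example), "<-", example)
```

Labels produced this way could then serve as training targets for a neural classifier. A realistic pipeline would additionally need negation handling, domain-specific terms, and separate tokenizers and lexicons for French and Arabic.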
