Abstract

In recent years, sentiment analysis (SA) has raised the interest of researchers in several domains, including higher education. It can be applied to measure the quality of the services supplied by the higher education institution and construct a university ranking mechanism from social media like Twitter. Hence, this study presents a novel system for Twitter sentiment prediction on Moroccan public universities in real-time. It consists of two phases: offline sentiment analysis phase and real-time prediction phase. In the offline phase, the collected French tweets about twelve Moroccan universities were classified according to their sentiment into ‘positive’, ‘negative’, or ‘neutral’ using six machine learning algorithms (random forest, multinomial Naive Bayes classifier, logistic regression, decision tree, linear support vector classifier, and extreme gradient boosting) with the term frequency-inverse document frequency (TF-IDF) and count vectorizer feature extraction techniques. The results reveal that random forest classifier coupled with TF-IDF has obtained the best test accuracy of 90%. This model was then applied on real-time tweets. The real-time prediction pipeline comprises Twitter streaming API for data collection, Apache Kafka for data ingestion, Apache Spark for real-time sentiment analysis, Elasticsearch for real-time data exploration, and Kibana for data visualization. The obtained results can be used by the Ministry of higher education, scientific research and innovation of Morocco for decision-making process.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.