Abstract

The volume of information available on social media makes it a suitable environment for identifying users' reactions and attitudes towards a particular topic, product, or issue. To analyze this data and extract useful information, machine learning algorithms are used to assign texts to predefined categories. Analyzing data in the Arabic language is a challenge, and few studies focus on Arabic text mining. This paper focuses on sentiment analysis of Arabic tweets and compares the performance of three machine learning classifiers: Logistic Regression (LR), K-Nearest Neighbors (KNN), and Decision Tree (DT). Four Arabic text datasets are used in the experiments to evaluate the classifiers. For comparison, we used four evaluation metrics: recall, precision, F-measure, and accuracy. The results show that Logistic Regression achieves a higher accuracy on large datasets (93%) than the other classifiers. LR's accuracy improved as the volume of data increased, whereas the other classifiers recorded a noticeable decrease in accuracy on the last dataset (74% for both KNN and DT on the 100K reviews dataset). In addition, the KNN and LR classifiers outperform the DT classifier on small datasets such as AJGT and ASTD.
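The four evaluation metrics named above can be illustrated with a minimal sketch. The abstract does not say how the metrics were computed or averaged, so this example assumes a binary setting with one class treated as positive; the function name `evaluate`, the `BinaryMetrics` container, and the label values are hypothetical and the sample labels are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f_measure: float


def evaluate(y_true, y_pred, positive="pos"):
    """Compute accuracy, precision, recall and F-measure for one positive class.

    Assumption: binary sentiment labels; the paper's own averaging scheme
    is not stated in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one example")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f_measure)


# Hypothetical gold labels and classifier predictions (not from the paper).
y_true = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
y_pred = ["pos", "neg", "neg", "pos", "pos", "neg", "pos", "neg"]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

Running the same function on the predictions of each classifier (LR, KNN, DT) for each dataset would give a comparison table of the kind the abstract summarizes.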
