Urdu Sentiment Analysis

Iffraah Rehman, Tariq Rahim Soomro

doi:10.2478/acss-2022-0004

Abstract

The world is moving toward increasingly digitalized data, and the number of active social media users grows with each passing day. Each post and comment can offer valuable insight into a topic or issue, a product, a brand, and so on. The process of uncovering the underlying information in the opinion a person holds about an entity is called sentiment analysis. The analysis can be carried out through two main approaches: lexicon-based methods or machine learning algorithms. A significant amount of sentiment analysis work has been done across different domains in numerous languages, but minimal research has been conducted on Urdu, the national language of Pakistan. Twitter users who are familiar with Urdu post tweets in two textual formats: Urdu script (Nastaleeq) or Roman Urdu. This paper therefore attempts to perform sentiment analysis on the Urdu language by extracting tweets (both Nastaleeq and Roman Urdu) from Twitter using the Tweepy API. A machine learning-based approach was adopted for this study, with WEKA as the tool. The best algorithm was identified based on evaluation metrics comprising the number of correctly and incorrectly classified instances, accuracy, precision, and recall. SMO was found to be the most suitable machine learning algorithm for sentiment analysis of Urdu (Nastaleeq) tweets, while Random Forest was identified as the best algorithm for Roman Urdu tweets.
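As a rough illustration of the evaluation metrics named in the abstract, the following Python sketch computes the number of correctly and incorrectly classified instances, accuracy, precision, and recall from a list of gold labels and predicted labels. It is not the authors' code: the study used WEKA, and the function name `evaluate`, the label strings, and the sample data here are all hypothetical. The sketch scores a single class of interest, whereas a tool such as WEKA typically also reports per-class and averaged figures.

```python
def evaluate(gold, predicted, positive_label):
    """Compute simple classification metrics for one class of interest.

    gold, predicted: equal-length sequences of labels.
    positive_label: the label treated as the "positive" class.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("at least one instance is required")

    pairs = list(zip(gold, predicted))

    # Correctly / incorrectly classified instances over all classes.
    correct = sum(g == p for g, p in pairs)
    incorrect = len(pairs) - correct

    # Counts relative to the chosen positive class.
    tp = sum(g == positive_label and p == positive_label for g, p in pairs)
    fp = sum(g != positive_label and p == positive_label for g, p in pairs)
    fn = sum(g == positive_label and p != positive_label for g, p in pairs)

    # Guard against division by zero when a class is never predicted
    # or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    return {
        "correct": correct,
        "incorrect": incorrect,
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
    }


# Made-up labels for illustration only; these are not data from the paper.
gold = ["pos", "neg", "pos", "neg", "pos", "neg"]
pred = ["pos", "neg", "neg", "neg", "pos", "pos"]
metrics = evaluate(gold, pred, "pos")
print(metrics)
```

Comparing two classifiers on the same labeled tweets would then amount to calling `evaluate` once per classifier and comparing the resulting dictionaries, which mirrors the kind of comparison the abstract describes.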

Journal: Applied Computer Systems	Publication Date: Jun 1, 2022
Citations: 5	License type: CC BY 4.0

Similar Papers

Automatic Detection of Offensive Language for Urdu and Roman Urdu
Muhammad Pervez Akhter ... Muhammad Tariq Sadiq
IEEE Access | VOL. 8
01 Jan 2020

RUTUT: Roman Urdu to Urdu Translator Based on Character Substitution Rules and Unicode Mapping
Mobeen Shahroz ... Gyu Sang Choi
IEEE Access | VOL. 8
01 Jan 2020

Pseudo Transfer Learning by Exploiting Monolingual Corpus: An Experiment on Roman Urdu Transliteration
Muhammad Yaseen Khan ... Tafseer Ahmed
01 Jan 2020

Discourse Based Opinion Mining on Roman Urdu Data
Zareen Sharaf ... Husnain Manzoor Ali
Journal of Independent Studies and Research Computing | VOL. 17
01 Jan 2019
