A Hybrid Approach for the Sentiment Analysis of Turkish Twitter Data

H A Shehu, S Tokat

doi:10.1007/978-3-030-36178-5_15

Abstract

Social media now plays an important role in influencing people’s sentiments. It also helps analyze how people, particularly consumers, feel about a particular topic, product or idea. One of the recent social media platforms that people use to express their thoughts is Twitter. Because Turkish is an agglutinative language, its complexity makes sentiment analysis difficult. In this study, a total of 13K Turkish tweets were collected using the Twitter API and their sentiments were analyzed using machine learning classifiers. Two kinds of classifiers were adopted: random forests and support vector machines. Preprocessing methods were applied to the collected data to remove links, numbers, punctuation and meaningless characters. After the preprocessing phase, unsuitable data were removed and 10,500 of the 13K downloaded tweets were taken as the main dataset. The tweets were classified as positive, negative or neutral based on their contents. The main dataset was converted to a stemmed dataset by removing stopwords, applying tokenization and applying stemming, in that order. Portions of 3,000 and 10,500 of the stemmed data, with equal distribution from each class, were identified as the first and second datasets to be used in the testing phase. Experimental results showed that support vector machines perform better at classifying negative and neutral stemmed data, while the random forest algorithm performs better at classifying positive stemmed data. A hybrid approach consisting of a hierarchical combination of random forests and support vector machines was therefore developed and used to classify the data. Finally, the applied methodologies were tested on both the first and the second dataset. It was observed that while neither support vector machines nor random forests alone reached an accuracy of 77% on the first dataset or 72% on the second, the developed hybrid approach achieved an accuracy of up to 86.4% and 82.8% on the first and second dataset, respectively.
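The preprocessing steps and the hierarchical combination described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the regular expressions, the tiny stopword list and the decision rule in `hybrid_label` are all assumptions made for illustration; the abstract does not specify how the hierarchy is arranged, and stemming is omitted because it would need a Turkish-specific stemmer.

```python
import re

# Assumed patterns for the cleaning steps named in the abstract.
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
DIGIT_RE = re.compile(r"\d+")                    # numbers
# Anything that is not a letter/digit or whitespace (punctuation, emoji, "_").
# \w is Unicode-aware in Python 3, so Turkish letters (ç, ğ, ı, ö, ş, ü) survive.
NON_WORD_RE = re.compile(r"[^\w\s]|_")
WS_RE = re.compile(r"\s+")

# Hypothetical, deliberately tiny stopword list; a real list would be much longer.
STOPWORDS = frozenset({"ve", "bir", "bu", "da", "de"})


def clean_tweet(text):
    """Remove links, numbers and punctuation, then collapse whitespace."""
    text = URL_RE.sub(" ", text)
    text = DIGIT_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text)
    return WS_RE.sub(" ", text).strip()


def tokenize(text):
    """Whitespace tokenization followed by (case-sensitive) stopword removal."""
    return [tok for tok in clean_tweet(text).split() if tok not in STOPWORDS]


def hybrid_label(rf_label, svm_label):
    """Assumed hierarchy: trust the random forest when it predicts 'positive',
    otherwise fall back to the support vector machine's prediction."""
    return "positive" if rf_label == "positive" else svm_label


if __name__ == "__main__":
    print(tokenize("Harika bir gün! https://t.co/abc 123 :)"))
    print(hybrid_label("negative", "neutral"))
```

The rule in `hybrid_label` simply mirrors the reported observation that random forests were stronger on the positive class and support vector machines on the negative and neutral classes; the per-class predictions themselves would come from trained classifiers, which are outside this sketch.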

