Effectiveness of Word Embeddings on Classifiers: A Case Study with Tweets

Sukanya Manna, Haruto Nakai

doi:10.1109/icosc.2019.8665538

Abstract

Twitter is a popular micro-blogging platform that offers a rich source of real-time information about real-world events, particularly during mass emergencies and crises. During any crisis, a large number of tweets must be filtered within a short span of time to extract crisis-related information. Different machine learning (ML) algorithms have been used to separate crisis-related tweets from non-crisis-related ones, and thus play a significant role in building applications for emergency management. As data proliferates, processing the growing stream of information becomes unmanageable. This paper therefore focuses on (1) different Natural Language Processing (NLP) techniques that make tweets suitable for ML algorithms, (2) different word embeddings that create a more domain-specific semantic space and address dimension reduction for efficient tweet analysis, and (3) a comparative analysis of different state-of-the-art ML algorithms (classifiers) that can be applied to categorize crisis-related tweets with higher accuracy. The experiments were conducted on six different crisis-related datasets, each consisting of approximately 10,000 tweets. Our analysis shows that Neural Networks outperformed all other classifiers, such as Naive Bayes, Logistic Regression, and Support Vector Machines. Moreover, word-embedding models trained on more domain-specific data can even outperform the pre-trained models.
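
The first two steps named in the abstract (cleaning tweets, then mapping them into an embedding space) can be illustrated with a minimal sketch. The abstract does not list the specific preprocessing steps or embedding models used, so everything here is an assumption: the cleaning rules are common choices for tweets, and `TOY_EMBEDDINGS`, `preprocess`, and `embed` are hypothetical names with invented three-dimensional vectors standing in for a trained model.

```python
import re

# Hypothetical 3-dimensional embeddings, invented purely for illustration.
# A real system would load vectors from a trained word-embedding model.
TOY_EMBEDDINGS = {
    "flood": [0.9, 0.1, 0.0],
    "rescue": [0.8, 0.2, 0.1],
    "concert": [0.0, 0.9, 0.3],
}


def preprocess(tweet):
    """Lowercase a tweet and strip URLs, @mentions, and the '#' symbol.

    These cleaning rules are assumed, not taken from the paper.
    """
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word only
    return re.findall(r"[a-z]+", text)         # alphabetic tokens


def embed(tokens, embeddings, dim=3):
    """Average the vectors of known tokens; zero vector if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(column) / len(vectors) for column in zip(*vectors)]


tokens = preprocess("RT @user: #Flood rescue underway http://t.co/abc")
features = embed(tokens, TOY_EMBEDDINGS)
```

The resulting fixed-length `features` vector is the kind of input that a classifier such as Logistic Regression or a neural network could be trained on. Averaging is only one simple way to pool word vectors into a tweet-level representation.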

Similar Papers

Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports.
Po-Hao Chen ... Maya Galperin-Aizenberg
Journal of Digital Imaging | VOL. 31
27 Oct 2017

Comparative Analysis of Different Classifiers on Crisis-Related Tweets: An Elaborate Study
Sukanya Manna ... Haruto Nakai
04 Sep 2019

User Interface Bug Classification Model Using ML and NLP Techniques: A Comparative Performance Analysis of ML Models
Sara Khan ... Saurabh Pal
International Journal of Experimental Research and Review | VOL. 45
30 Nov 2024

Prediction of outcome in malignant hypertension patients using machine learning: a report from the West Birmingham malignant hypertension registry
A Argyris ... G Lip
European Heart Journal | VOL. 45
28 Oct 2024
