Abstract

Social media has become one of the most popular sources of information. People communicate with each other and share their ideas, commenting on global issues and events in a multilingual environment. Although social media has been popular for several years, online data volumes have recently grown exponentially because of the increasing popularity of local languages on the web. This allows researchers in the NLP community to exploit the richness of different languages while overcoming the challenges these languages pose. Urdu is one of the most widely used local languages on social media. In this paper, we present the first-ever event detection approach for Urdu language text. Multiclass event classification is performed by popular deep learning (DL) models, i.e., convolutional neural network (CNN), recurrent neural network (RNN), and deep neural network (DNN). Feature vectors based on one-hot encoding, word embedding, and term frequency-inverse document frequency (TF-IDF) are used to evaluate the DL models. The dataset used for the experimental work consists of more than 0.15 million (103965) labeled sentences. The DNN classifier achieved a promising accuracy of 84% in extracting and classifying events in Urdu language script.
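To make the TF-IDF feature vectors mentioned above concrete, the following is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `tfidf_vectors`, the toy corpus (English placeholders rather than Urdu text), and the specific weighting (length-normalized term frequency times log(N / df)) are all assumptions chosen for illustration, since the abstract does not state which TF-IDF variant was used.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) for a list of tokenized documents.

    Assumed weighting: tf = count / document length, idf = log(N / df).
    Other TF-IDF variants exist; this one is chosen only for illustration.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) for tok in vocab}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors


# Hypothetical toy corpus, already tokenized (placeholder sentences).
docs = [
    ["flood", "hits", "city"],
    ["team", "wins", "match"],
    ["flood", "damages", "match", "venue"],
]
vocab, vectors = tfidf_vectors(docs)
```

Each sentence becomes one fixed-length vector over the vocabulary, which is the kind of input a classifier such as a DNN could then consume. Terms that appear in fewer sentences receive a higher weight than terms shared across many sentences.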

Highlights

  • In the current digital era, social media dominates other sources of communication, i.e., print and broadcast media [1]

  • Instead of using the joint framework of convolutional neural network (CNN) and recurrent neural network (RNN) for sentiment analysis [35], we evaluated the performance of deep learning models for multiclass event classification

  • We analyzed the performance of deep learning models, i.e., deep neural network, convolutional neural network, and recurrent neural network, along with other machine learning classifiers, i.e., K-nearest neighbor, decision tree, random forest, support vector machine, multinomial Naïve Bayes, and linear regression

Summary

Introduction

In the current digital era, social media dominates other sources of communication, i.e., print and broadcast media [1]. The usage of local languages on social media has been overwhelming over the last few years. [4] in the world via social media using local languages. A considerable amount of heterogeneous data is being generated, which makes it challenging to extract useful insights, yet this information plays a vital role in developing natural language processing (NLP) applications, e.g., sentiment analysis [5], risk factor analysis [6], law and order prediction, timeline construction, opinion mining, decision-making systems [7], social media monitoring [8], spam detection, information retrieval, document classification [9], e-mail categorization [10], sentence classification [11], topic modeling [12], content labeling, and finding the latest trends.
