Abstract
With the proliferation of smart gadgets and devices across the world, the data they generate has been growing at exponential rates; social media platforms such as Facebook, Twitter and Instagram in particular generate voluminous data on a daily basis. According to Twitter’s usage statistics, about 500 million tweets are generated each day. While tweets reflect users’ opinions on events across the world, some tweets are offensive in nature and need to be tagged under Twitter’s hateful conduct policy. Offensive tweets have to be identified, captured and processed further for a variety of reasons, which include i) preventing violent or abusive behavior on Twitter (or on any social media platform, for that matter), ii) creating and maintaining a history of offensive tweets for individual users (which would help in creating meta-data for user profiles), and iii) inferring the sentiment of users on a particular event, issue or topic. We employ neural network models that combine attention with a Temporal Convolutional Neural Network for the three shared sub-tasks: i) ATT-TCN (ATTention based Temporal Convolutional Neural Network), employed for shared sub-task A, which yielded a best macro-F1 score of 0.46, and ii) SAE-ATT-TCN (Self Attentive Embedding-ATTention based Temporal Convolutional Neural Network), employed for shared sub-tasks B and C, which yielded best macro-F1 scores of 0.61 and 0.51 respectively. Of the two variants, ATT-TCN and SAE-ATT-TCN, the latter performed better.
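The macro-F1 scores reported above are the unweighted mean of the per-class F1 scores, so a rare class counts as much as a frequent one. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `macro_f1` and the labels `OFF` and `NOT` are hypothetical and chosen only for illustration.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    f1_scores = []
    for label in labels:
        # Count true positives, false positives and false negatives for this class.
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Hypothetical labels: "OFF" = offensive, "NOT" = not offensive.
gold = ["OFF", "NOT", "NOT", "OFF", "NOT"]
pred = ["OFF", "NOT", "OFF", "NOT", "NOT"]
score = macro_f1(gold, pred)
```

In this toy example the per-class F1 values are 2/3 for `NOT` and 1/2 for `OFF`, so the macro average is 7/12.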
Highlights
In the prevailing digital era, Deep Learning has penetrated almost all industry verticals and given researchers an effective tool for handling voluminous data and deriving meaningful inferences
LeCun et al. (1998) introduced the Convolutional Neural Network (CNN) model for the extraction of local features, which later became the standard choice for Computer Vision tasks. Hochreiter and Schmidhuber (1997) introduced the LSTM (Long Short-Term Memory) architecture, which went on to become the standard choice for Natural Language Processing tasks because of the implicit ordering of sequence data in words and sentences
Neural network model generation, fine-tuning and prediction on the evaluation test data were all carried out in the Google Colaboratory environment, using the available GPU hardware accelerator
Summary
In the prevailing digital era, Deep Learning has penetrated almost all industry verticals and given researchers an effective tool for handling voluminous data and deriving meaningful inferences. Hochreiter and Schmidhuber (1997) introduced the LSTM (Long Short-Term Memory) architecture, which went on to become the standard choice for Natural Language Processing (sequence) tasks because of the implicit ordering of sequence data in words and sentences. Ever since social media became ubiquitous, some individuals have taken gratuitous advantage of the anonymity of social media platforms to engage in rude and offensive communication. Such behaviour, which inhibits the free flow of communication and violates acceptable usage policies, has made it necessary to identify and capture offensive posts, comments, etc., in order to prevent the spread of abusive behaviour on social media. The task comprises three shared sub-tasks: i) Sub-Task A: Offensive language identification, ii) Sub-Task B: Offense type categorization and