Abstract

Emotion detection is a widely studied topic in natural language processing due to its significance in a number of application areas. A plethora of studies have been conducted on emotion detection in European as well as Asian languages. However, a large majority of these studies have been conducted in monolingual settings, whereas little attention has been paid to emotion detection in code-mixed text. Specifically, merely one study has been conducted on emotion detection inRoman Urdu (RU)andEnglish (EN)code-mixed text despite the fact that such text is widely used in social media platforms. A careful examination of the existing study has revealed several issues which justify that this area requires attention of researchers. For instance, more than 37% of the messages in the contemporary corpus are monolingual sentences representing that a purely code-mixed emotion analysis corpus is non-existent. To that end, this study has scrapped 400,000 sentences from three social media platforms to identify 20,000 RU-EN code-mixed sentences. Subsequently, an iterative approach is employed to develop emotion detection guidelines. These guidelines have been used to develop a large RU-EN emotion detection (RU-EN-Emotion) corpus in which 20,000 sentences are annotated as Neutral or Emotion-sentence. The sentences having emotions are further annotated with the respective emotions. Subsequently, 102 experiments are performed to evaluate the effectiveness of six classical machine learning techniques and six deep learning techniques. The results show, (a) CNN is the most effective technique when used with GloVe embeddings, and (b) our developed RU-EN-Emotion corpus is more useful than the contemporary corpus, as it employs a two-level classification approach.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.