Abstract

Sentiment analysis is a circumstantial analysis of text, identifying the social sentiment to better understand the source material. The article addresses sentiment analysis of an English-Hindi and English-Bengali code-mixed textual corpus collected from social media. Code-mixing is an amalgamation of multiple languages, which previously mainly was associated with spoken language. However, social media users also deploy it to communicate in ways that tend to be somewhat casual. The coarse nature of social media text poses challenges for many language processing applications. Here, the focus is on the low predictive nature of traditional machine learners when compared to Deep Learning counterparts, including the contextual language representation model BERT (Bidirectional Encoder Representations from Transformers), on the task of extracting user sentiment from code-mixed texts. Three deep learners (a BiLSTM CNN, a Double BiLSTM and an Attention-based model) attained accuracy 20–60% greater than traditional approaches on code-mixed data, and were for comparison also tested on monolingual English data.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.