Abstract

YouTube lets people educate, entertain, and express themselves on a wide range of topics. YouTube India currently has millions of active users, so the volume of data generated on the platform is large. India is a very diverse country and many of its people are multilingual, so opinions are often expressed in code-mixed form, that is, in a mixture of two or more languages. Because there is little research on Indian code-mixed language data, Sentiment Analysis on such data has become a necessity. In this paper, Sentiment Analysis (SA) is carried out on Marglish (Marathi + English) as well as Devanagari Marathi comments extracted through the YouTube API from top Marathi channels. Several machine-learning models are applied to the dataset along with three different vectorizing techniques. Multilayer Perceptron (MLP) with the Count vectorizer gives the best accuracy of 62.68% on the Marglish dataset, and Bernoulli Naïve Bayes with the Count vectorizer gives the best accuracy of 60.60% on the Devanagari dataset. Multilayer Perceptron and Bernoulli Naïve Bayes are therefore considered the best-performing algorithms. 10-fold cross-validation and statistical testing were also carried out on the dataset to confirm the results.

Highlights

  • According to the World Bank, the number of Internet users in India has been rising exponentially since 2010

  • This paper takes up the challenge of performing Sentiment Analysis (SA) on Marglish-language comments from cookery channels

  • Count Vectorizer with Bernoulli Naïve Bayes (BNB) gives the best results for the Devanagari dataset
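
To make the pairing of a count vectorizer with Bernoulli Naïve Bayes concrete, the following is a minimal from-scratch sketch in Python. It is not the authors' implementation: the whitespace tokenizer, the Laplace smoothing value `alpha=1.0`, the class and function names, and the four toy Marglish comments with their labels are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; the paper's preprocessing is not specified here.
    return text.lower().split()


def build_vocabulary(comments):
    # Sorted list of every distinct token seen in the training comments.
    return sorted({tok for c in comments for tok in tokenize(c)})


def count_vector(comment, vocabulary):
    # Count-vectorizer step: one term count per vocabulary entry (unknown tokens are dropped).
    counts = Counter(tokenize(comment))
    return [counts[tok] for tok in vocabulary]


class BernoulliNaiveBayes:
    """Bernoulli Naive Bayes: only the presence or absence of each term is modelled."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing (assumed value)

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        n_features = len(vectors[0])
        self.log_prior = {}
        self.feature_prob = {}
        for c in self.classes:
            rows = [v for v, y in zip(vectors, labels) if y == c]
            self.log_prior[c] = math.log(len(rows) / len(vectors))
            # Smoothed P(term present | class) for every vocabulary term.
            self.feature_prob[c] = [
                (sum(1 for r in rows if r[j] > 0) + self.alpha)
                / (len(rows) + 2 * self.alpha)
                for j in range(n_features)
            ]
        return self

    def predict(self, vector):
        best_class, best_score = None, -math.inf
        for c in self.classes:
            score = self.log_prior[c]
            for x, p in zip(vector, self.feature_prob[c]):
                # Absent terms also contribute, via log(1 - p).
                score += math.log(p if x > 0 else 1.0 - p)
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical toy data, invented for this sketch (not taken from the paper's dataset).
train_comments = [
    "khup chan recipe",
    "mast video khup avadla",
    "ekdum bekar recipe",
    "video avadla nahi bekar",
]
train_labels = ["positive", "positive", "negative", "negative"]

vocabulary = build_vocabulary(train_comments)
train_vectors = [count_vector(c, vocabulary) for c in train_comments]
model = BernoulliNaiveBayes().fit(train_vectors, train_labels)

print(model.predict(count_vector("khup chan video", vocabulary)))  # positive
print(model.predict(count_vector("ekdum bekar", vocabulary)))      # negative
```

The count vectorizer produces term counts, but the Bernoulli model binarizes them, so a word repeated many times in a comment counts the same as a word used once. In practice a library implementation and a 10-fold cross-validation loop, as mentioned in the abstract, would replace the toy training set used here.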


Summary

Introduction

According to the World Bank, the number of Internet users in India has been rising exponentially since 2010. 16% watch videos and 15% use search engines. Based on these statistics, it is evident that a large number of people use social media platforms (Facebook, Twitter, WhatsApp, Instagram) and YouTube (to watch videos). According to [3], YouTube India already has 245 million active users and reaches 80–85% of the Internet-using population. After Internet traffic rates became cheaper in India in 2016, there was a huge demand for YouTube videos in Indian languages. Genres such as comedy, food, and travel are becoming people’s choice, and the demand for them is rising day by day.

