Abstract

Regional-language content is key to the globalization of any successful internet-based business model. A large population now prefers to access the internet in its mother tongue or regional language, and this has become the new normal. Regional-language content on social media and World Wide Web pages has drawn the attention of many business analysts, data scientists, and social reformers who want to understand regional-language sentiment through this large volume of opinionated text. Regional-language sentiment analysis, and Marathi sentiment analysis in particular, becomes possible only once a dataset exists that can cope with the text-analytics challenges of a regional language, such as uniformity and syntactic and semantic issues. This study is a small attempt to create a basic dataset that can support future Marathi sentiment analysis based on Natural Language Processing (NLP) and Sentiment Analysis (SA) algorithms. It aims to generate a Marathi-language dataset from opinionated social media text and from web scraping of a Marathi-language webpage. All technical issues associated with generating the dataset will be recorded, rectified, and refined through rigorous iterations so that the dataset is ready for future Marathi sentiment analysis. The study also aims to understand what regional sentiment analysis requires of a dataset, the most suitable file structure, and an efficient way of creating and customizing the Marathi text dataset so that it is NLP- and SA-ready for follow-up studies.

