Abstract

In this paper we demonstrate approaches to opinion mining in Latvian text. We have applied, combined and extended the results of several previous studies and public resources to perform opinion mining in Latvian text using two approaches: semantic polarity analysis and machine learning. One of the most significant constraints that makes opinion mining for classifying written content in Latvian challenging is the limited availability of public text corpora for classifier training. We have joined several sources and created a publicly available extended lexicon. Our results are comparable to or outperform current achievements in opinion mining in Latvian. Experiments show that lexicon-based methods provide more accurate opinion mining on Latvian tweets than a Naive Bayes machine learning classifier. The methods used in this study could be further extended using human annotators, unsupervised machine learning and bootstrapping to create larger corpora of classified text.

Highlights

  • The growing amount of information accessible through the web, especially through social networks, combined with the comparatively limited human capacity to perceive and process large amounts of information, has raised the demand for ways to interpret this information automatically

  • Experiments show that lexicon-based methods provide more accurate opinion mining on Latvian tweets than a Naive Bayes machine learning classifier

  • Based on the hypothesis that the methods, problem areas and results of opinion mining in languages other than English could be relevant to opinion mining in Latvian, we reviewed multiple recent non-English opinion mining studies, focusing on opinion mining of texts from the Twitter social network


Introduction

The amount of information accessible through the web, and especially through social networks, keeps growing, while the human capacity to perceive and process large amounts of information remains comparatively limited. This gap has raised the demand for ways to interpret such information automatically. Most work on opinion mining, and especially most tools, focuses on widely used languages such as English. The authors of [1] performed a comprehensive study of opinion mining algorithms and applications, reviewing 54 sources, and stated that “the interest in languages other than English in this field is growing as there is still a lack of resources and researches concerning these languages”. A typical issue for less widely spoken languages is the limited availability of text resources for building automated opinion mining systems. Developing these resources and applying different combinations of methods to opinion mining in Latvian is the scope of our research.
