Abstract

Most opinion mining methods in English rely successfully on sentiment lexicons, such as English SentiWordNet (ESWN). While there have been efforts towards building Arabic sentiment lexicons, they suffer from many deficiencies: limited size, an unclear usability plan given Arabic’s rich morphology, or lack of public availability. In this paper, we address all of these issues and produce the first publicly available large-scale Standard Arabic sentiment lexicon (ArSenL) using a combination of existing resources: ESWN, Arabic WordNet, and the Standard Arabic Morphological Analyzer (SAMA). We compare and combine two methods of constructing this lexicon with an eye on insights for Arabic dialects and other low-resource languages. We also present an extrinsic evaluation in terms of subjectivity and sentiment analysis.

Highlights

  • Opinion mining refers to the extraction of subjectivity and polarity from text (Pang and Lee, 2005)

  • We define our target Arabic sentiment lexicon as a resource that pairs the Arabic lemmas used in the Standard Arabic Morphological Analyzer (SAMA) with sentiment scores such as those used in the English lexicon SentiWordNet (ESWN)

  • We rely on four existing resources to create the Arabic sentiment lexicon (ArSenL): English WordNet (EWN), Arabic WordNet (AWN), English SentiWordNet (ESWN), and SAMA

Summary

Introduction

Opinion mining refers to the extraction of subjectivity and polarity from text (Pang and Lee, 2005). Some opinion mining methods in English rely on the English lexicon SentiWordNet (ESWN) (Esuli and Sebastiani, 2006; Baccianella et al., 2010) for extracting word-level sentiment polarity. Existing Arabic sentiment lexicons, by contrast, are limited in size, lack a clear usability plan given Arabic’s rich morphology, or are not publicly available. We propose to address these challenges and create a large-scale sentiment lexicon that benefits from available Arabic lexica. We develop two lexicons. The first is created by matching Arabic WordNet (AWN) (Black et al., 2006) to ESWN; this path relies on the existence of a wordnet, a rather expensive resource. The second is developed by matching lemmas in the SAMA (Graff et al., 2009) lexicon to ESWN directly.
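
The two construction paths described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The synset ids, transliterated lemmas, scores, and all function names are invented for the example. The sketch assumes that SAMA lemmas are matched through their English glosses (as the section title "English Gloss-based Approach" suggests), that matching is a plain exact-word overlap, and that the two lexicons are combined by simple union. It also assumes ESWN-style entries carry a positive and a negative score, with objectivity as the remainder.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sentiment:
    """ESWN-style scores; objectivity is assumed to be the remainder."""
    pos: float
    neg: float

    @property
    def obj(self) -> float:
        return 1.0 - self.pos - self.neg


# Toy stand-ins for the real resources (every entry here is invented).
ESWN = {  # synset id -> (English terms in the synset, scores)
    "syn-001": ({"happy", "glad"}, Sentiment(0.75, 0.0)),
    "syn-002": ({"sad"}, Sentiment(0.0, 0.625)),
    "syn-003": ({"book"}, Sentiment(0.0, 0.0)),
}
AWN = {  # Arabic lemma -> synset ids it is linked to
    "saEiyd": ["syn-001"],
    "kitAb": ["syn-003"],
}
SAMA_GLOSSES = {  # Arabic lemma -> English glosses
    "saEiyd": ["happy"],
    "Haziyn": ["sad", "sorrowful"],
}


def lexicon_from_wordnet(awn, eswn):
    """Path 1: follow wordnet links from Arabic lemmas to scored synsets."""
    return {
        lemma: {sid: eswn[sid][1] for sid in synset_ids if sid in eswn}
        for lemma, synset_ids in awn.items()
    }


def lexicon_from_glosses(glosses, eswn):
    """Path 2: match each lemma's English glosses against synset terms."""
    lexicon = {}
    for lemma, words in glosses.items():
        matches = {
            sid: scores
            for sid, (terms, scores) in eswn.items()
            if terms & set(words)  # at least one gloss word is in the synset
        }
        if matches:
            lexicon[lemma] = matches
    return lexicon


def combine(*lexicons):
    """Union: keep every (lemma, synset) pair found by any path."""
    merged = {}
    for lexicon in lexicons:
        for lemma, entries in lexicon.items():
            merged.setdefault(lemma, {}).update(entries)
    return merged


arsenl_sketch = combine(
    lexicon_from_wordnet(AWN, ESWN),
    lexicon_from_glosses(SAMA_GLOSSES, ESWN),
)
```

In this toy data, the lemma `Haziyn` has no wordnet entry and is recovered only through its glosses, while `kitAb` is reachable only through the wordnet. This is the kind of coverage difference that would motivate comparing and combining the two paths.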

Literature Review
Approaches to Lexicon Creation
Resources
Arabic WordNet-based Approach
English Gloss-based Approach
Combining the Two Approaches
Evaluation
Results
Conclusion and Future Work
