Abstract
Most opinion mining methods in English rely successfully on sentiment lexicons, such as English SentiWordNet (ESWN). While there have been efforts towards building Arabic sentiment lexicons, they suffer from several deficiencies: limited size, no clear plan for use given Arabic's rich morphology, or lack of public availability. In this paper, we address all of these issues and produce the first publicly available large-scale Standard Arabic sentiment lexicon (ArSenL) using a combination of existing resources: ESWN, Arabic WordNet, and the Standard Arabic Morphological Analyzer (SAMA). We compare and combine two methods of constructing this lexicon with an eye on insights for Arabic dialects and other low-resource languages. We also present an extrinsic evaluation in terms of subjectivity and sentiment analysis.
Highlights
Opinion mining refers to the extraction of subjectivity and polarity from text (Pang and Lee, 2005)
We define our target Arabic sentiment lexicon as a resource pairing the Arabic lemmas used in the Standard Arabic Morphological Analyzer (SAMA) with sentiment scores such as those used in the English SentiWordNet (ESWN)
We rely on four existing resources to create the Arabic sentiment lexicon (ArSenL): English WordNet (EWN), Arabic WordNet (AWN), English SentiWordNet (ESWN) and SAMA
Summary
Opinion mining refers to the extraction of subjectivity and polarity from text (Pang and Lee, 2005). Some opinion mining methods in English rely on the English lexicon SentiWordNet (ESWN) (Esuli and Sebastiani, 2006; Baccianella et al., 2010) for extracting word-level sentiment polarity. We propose to address the deficiencies of existing Arabic sentiment lexicons and create a large-scale sentiment lexicon that benefits from available Arabic lexica. One lexicon is created by matching Arabic WordNet (AWN) (Black et al., 2006) to ESWN; this path relies on the existence of a wordnet, a rather expensive resource. The second lexicon is developed by matching lemmas in the SAMA (Graff et al., 2009) lexicon to ESWN directly.
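The two construction paths can be illustrated with a minimal Python sketch. This is not the authors' implementation. All identifiers, lemmas, and scores are invented toy data. The sketch assumes that the direct SAMA path matches an English gloss attached to each lemma against ESWN entries, and that the two lexicons are combined by a simple union in which the wordnet path takes precedence; neither detail is stated in this summary.

```python
from typing import Dict, List, Tuple

Scores = Tuple[float, float]  # (positive, negative)

# All data below is invented toy data, not taken from ESWN, AWN, or SAMA.
ESWN: Dict[str, Scores] = {"good.a.01": (0.75, 0.0), "bad.a.01": (0.0, 0.625)}
ESWN_BY_WORD: Dict[str, List[str]] = {"good": ["good.a.01"], "bad": ["bad.a.01"]}
AWN_TO_EWN: Dict[str, str] = {"jayyid": "good.a.01"}       # hypothetical AWN link
SAMA_GLOSS: Dict[str, str] = {"jayyid": "good", "sayyi": "bad"}  # hypothetical glosses


def via_wordnet() -> Dict[str, Scores]:
    """Path 1: Arabic lemma -> linked English synset -> ESWN scores."""
    return {lemma: ESWN[syn] for lemma, syn in AWN_TO_EWN.items() if syn in ESWN}


def via_gloss() -> Dict[str, Scores]:
    """Path 2 (assumed): Arabic lemma -> English gloss -> averaged ESWN scores."""
    out: Dict[str, Scores] = {}
    for lemma, gloss in SAMA_GLOSS.items():
        synsets = ESWN_BY_WORD.get(gloss, [])
        if not synsets:
            continue  # no ESWN entry matches this gloss
        pos = sum(ESWN[s][0] for s in synsets) / len(synsets)
        neg = sum(ESWN[s][1] for s in synsets) / len(synsets)
        out[lemma] = (pos, neg)
    return out


def combine(primary: Dict[str, Scores], secondary: Dict[str, Scores]) -> Dict[str, Scores]:
    """Union of both lexicons; entries from `primary` win on conflicts."""
    merged = dict(secondary)
    merged.update(primary)
    return merged


lexicon = combine(via_wordnet(), via_gloss())
```

In this toy setup the wordnet path covers only the lemma that has an AWN link, while the gloss path also reaches a lemma without one, which is the coverage trade-off the two-lexicon design is meant to address.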