Semantic Model Representation For Human's Pre-conceived Notions In Arabic Text With Applications To Sentiment Mining

M.e Ramy Georges Baly, Hazem Hajj, Nizar Habash, Gilbert Badaro, Wassim El-Hajj, Khaled Shaban

doi:10.5339/qfarc.2014.itpp1075

Abstract

Opinion mining is growing in importance as opinionated data becomes available on the Internet and as its applications multiply. Intensive efforts have gone into developing opinion mining systems, particularly for English. Opinion mining in Arabic, however, remains challenging because of the language's complexity and rich morphology. Previous approaches fall into two categories: supervised approaches that use linguistic features to train machine learning classifiers, and unsupervised approaches that rely on sentiment lexicons. Various features have been exploited, including surface-based, syntactic, morphological, and semantic features, but the semantic extraction remains shallow. In this paper, we propose to go deeper into the semantics of the text for opinion mining. We propose a model inspired by the cognitive process that humans follow to infer sentiment, in which they rely on a database of preconceived notions developed throughout their life experiences. A key aspect of the proposed approach is a semantic representation of these notions. The model combines a set of textual representations of the notion (Ti) with a corresponding sentiment indicator (Si). Thus the pair (Ti, Si) denotes the representation of a notion. Notions can be constructed at different levels of text granularity, ranging from ideas covered by single words to ideas covered in full documents. The range also includes clauses, phrases, sentences, and paragraphs. To demonstrate the use of this new semantic model of preconceived notions, we develop the full representation of one-word notions by including the following set of syntactic features for Ti: word surfaces, stems, and lemmas, represented by binary presence and TF-IDF. We also include morphological features such as part-of-speech tags, aspect, person, gender, mood, and number.
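The two encodings named above for word surfaces, stems, and lemmas can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say which TF-IDF variant was used, so the sketch assumes length-normalized term frequency and idf = log(N / df). The function names and the toy tokens are hypothetical.

```python
import math
from collections import Counter


def binary_presence(tokens, vocabulary):
    """Return 1 for each vocabulary term that occurs in the token list, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


def tf_idf(documents, vocabulary):
    """Return one TF-IDF vector per document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(documents)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        length = len(tokens) or 1  # avoid division by zero on empty input
        row = []
        for term in vocabulary:
            tf = counts[term] / length
            idf = math.log(n_docs / doc_freq[term]) if doc_freq[term] else 0.0
            row.append(tf * idf)
        vectors.append(row)
    return vectors


# Hypothetical toy input: each sentence is a list of lemmas.
sentences = [["good", "film"], ["bad", "film"]]
vocab = ["bad", "film", "good"]

presence = binary_presence(sentences[0], vocab)
weights = tf_idf(sentences, vocab)
```

A term that appears in every sentence ("film" here) receives a TF-IDF weight of zero under this variant, whereas binary presence still marks it as 1.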
As for the notion sentiment indicator Si, we create a new set of features that indicate the words' sentiment scores based on an internally developed Arabic sentiment lexicon called ArSenL and a third-party lexicon called Sifaat. These features are extracted at the word level and are considered raw features. We also investigate additional features that reflect the aggregated semantics of a sentence. Such engineered features are derived from word-level information and include the count of subjective words and the average of sentiment scores per sentence. Experiments are conducted on a benchmark dataset collected from the Penn Arabic TreeBank (PATB) that is already annotated with sentiment labels. The results reveal that raw word-level features do not achieve satisfactory performance in sentiment classification. Feature reduction was also explored to evaluate the relative importance of the raw features, and it showed low correlations between individual raw features and sentiment labels. In contrast, including the engineered features had a significant impact on classification accuracy. The outcome of these experiments is a comprehensive set of features that reflect the representation of a one-word notion, or idea, in a human mind. The one-word results also show promise for higher-level context with multi-word notions.
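The notion representation and the sentence-level aggregates described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the `Notion` class, the function names, the single signed score per word, and the toy lexicon are all hypothetical, and the entries are not taken from ArSenL or Sifaat.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Notion:
    """A one-word notion: textual representation Ti plus sentiment indicator Si."""
    text_features: Dict[str, str]  # Ti, e.g. surface form, stem, lemma
    sentiment: float               # Si, assumed here to be one signed score


# Hypothetical toy lexicon mapping lemmas to signed sentiment scores.
TOY_LEXICON = {"great": 0.8, "poor": -0.6}


def build_notions(lemmas: List[str], lexicon: Dict[str, float]) -> List[Notion]:
    """Pair each lemma with its lexicon score; unknown lemmas score 0.0."""
    return [Notion({"lemma": lemma}, lexicon.get(lemma, 0.0)) for lemma in lemmas]


def sentence_features(notions: List[Notion]) -> Dict[str, float]:
    """Aggregate word-level scores into sentence-level features."""
    scores = [n.sentiment for n in notions]
    # A word counts as subjective when its score is nonzero (an assumption).
    subjective_count = sum(1 for s in scores if s != 0.0)
    avg_sentiment = sum(scores) / len(scores) if scores else 0.0
    return {"subjective_count": subjective_count, "avg_sentiment": avg_sentiment}


notions = build_notions(["the", "service", "great", "food", "poor"], TOY_LEXICON)
features = sentence_features(notions)
```

The aggregates summarize the whole sentence in two numbers, which is the kind of derived signal the abstract contrasts with raw word-level features.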
