Abstract

Opinion mining in a multilingual and multi-domain environment as YouTube requires models to be robust across domains as well as languages, and not to rely on linguistic resources (e.g. syntactic parsers, POS-taggers, pre-defined dictionaries) which are not always available in many languages. In this work, we i) proposed a convolutional N-gram BiLSTM (CoNBiLSTM) word embedding which represents a word with semantic and contextual information in short and long distance periods; ii) applied CoNBiLSTM word embedding for predicting the type of a comment, its polarity sentiment (positive, neutral or negative) and whether the sentiment is directed toward the product or video; iii) evaluated the efficiency of our model on the SenTube dataset, which contains comments from two domains (i.e. automobile, tablet) and two languages (i.e. English, Italian). According to the experimental results, CoNBiLSTM generally outperforms the approach using SVM with shallow syntactic structures (STRUCT) – the current state-of-the-art sentiment analysis on the SenTube dataset. In addition, our model achieves more robustness across domains than the STRUCT (e.g. 7.47% of the difference in performance between the two domains for our model vs. 18.8% for the STRUCT)

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call