Abstract

Compared with conventional word embeddings, sentiment embeddings can distinguish words that appear in similar contexts but carry opposite sentiment. They incorporate sentiment information from labeled corpora or lexicons through either end-to-end training or sentiment refinement. However, these methods have two major limitations. First, traditional approaches assign each word a single fixed representation and ignore how its meaning changes across contexts. As a result, a sentiment word whose polarity varies with context still receives the same representation everywhere. Second, out-of-vocabulary (OOV) and informally written sentiment words are mapped to a generic vector (e.g., <UNK>), and affective words that are missing from affective corpora or lexicons are treated as neutral. Building a neural model on such low-quality embeddings reduces its performance. This study proposes a model for training contextual sentiment embeddings. A stacked two-layer GRU is used as the language model and is trained to incorporate semantic and sentiment information simultaneously from labeled corpora and lexicons. To handle OOV and informally written sentiment words, the WordPiece tokenizer is used to split the text into subwords. The resulting model can be transferred to downstream applications either as a feature extractor or by fine-tuning. The results show that the proposed model can handle unseen or informally written sentiment words and thus outperforms previously proposed methods.
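
To illustrate how subword splitting keeps an unseen or informally written sentiment word from collapsing to a single generic vector, the sketch below implements the greedy longest-match-first procedure commonly used for WordPiece tokenization. It is a minimal illustration, not the authors' implementation: the function name `wordpiece_tokenize`, the toy vocabulary `TOY_VOCAB`, the `##` continuation marker, and the `[UNK]` token are assumptions chosen for the example, and the abstract does not state the vocabulary or tokenizer settings actually used.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Greedy longest-match-first split of a single word into subwords.

    Hypothetical helper for illustration; `vocab` is any set of known pieces.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # assumed marker for non-initial pieces
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary invented for this example only.
TOY_VOCAB = {"go", "good", "##o", "##d", "un", "##happi", "##ness"}

print(wordpiece_tokenize("unhappiness", TOY_VOCAB))
print(wordpiece_tokenize("goooood", TOY_VOCAB))
print(wordpiece_tokenize("xyz", TOY_VOCAB))
```

With this toy vocabulary, the elongated spelling "goooood" is still decomposed into known pieces beginning with "go", so a downstream encoder receives informative inputs, whereas a word with no matching pieces falls back to the unknown token.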
