Sentiment analysis aims to capture the sentiment that authors express in natural language text, and it is a core research topic in many areas of artificial intelligence. Existing machine-learning-based approaches generally build vector representations of documents from popular textual features, e.g., term frequency-inverse document frequency (tf-idf), n-gram features, and word embeddings. These representations can model rich syntactic and semantic information, but they largely fail to capture the sentiment information that is central to the task. To address this issue, we propose a quantum-inspired sentiment representation (QSR) model that represents the semantic content of documents and also captures their sentiment information. Specifically, since adjectives and adverbs are good indicators of subjective expression, the model first extracts sentiment phrases that match designed sentiment patterns based on adjectives and adverbs. Then, both single words and sentiment phrases in the documents are modeled as a collection of projectors, which are finally encapsulated in density matrices through maximum likelihood estimation. These density matrices integrate the sentiment information into the document representations. Extensive experiments are conducted on two widely used Twitter datasets: the Obama-McCain Debate (OMD) dataset and the Sentiment140 Twitter dataset. The experimental results show that our model significantly outperforms a number of state-of-the-art baselines, demonstrating the effectiveness of the QSR model for sentiment analysis.
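To make the projector and density-matrix step concrete, the following is a minimal sketch and not the authors' implementation. It rests on several assumptions that the abstract does not state: words are given as small hypothetical embedding vectors, a sentiment phrase is represented as an equal-weight superposition of its word vectors, and the maximum likelihood estimate is obtained with the iterative RρR update commonly used for this kind of likelihood. The function names `projector`, `phrase_projector`, and `mle_density` are illustrative only.

```python
import numpy as np

def projector(vec):
    """Rank-one projector |v><v| built from a (not necessarily unit) vector."""
    v = np.asarray(vec, dtype=float)
    return np.outer(v, v) / np.dot(v, v)

def phrase_projector(vecs):
    """Projector for a phrase.

    Assumption: the phrase is an equal-weight superposition (sum) of its
    word vectors; the abstract does not specify the weighting.
    """
    s = np.sum(np.asarray(vecs, dtype=float), axis=0)
    return projector(s)

def mle_density(projectors, n_iter=100):
    """Estimate a density matrix that maximizes prod_i tr(rho P_i).

    Assumption: uses the iterative R-rho-R update, starting from the
    maximally mixed state. Each step keeps rho symmetric, positive
    semidefinite, and of unit trace.
    """
    P = np.stack(projectors)              # shape (n, d, d)
    d = P.shape[1]
    rho = np.eye(d) / d                   # maximally mixed starting point
    for _ in range(n_iter):
        probs = np.einsum("ij,nji->n", rho, P)    # tr(rho P_n) for each n
        probs = np.maximum(probs, 1e-12)          # guard against division by zero
        R = np.einsum("n,nij->ij", 1.0 / probs, P) / len(P)
        rho = R @ rho @ R
        rho /= np.trace(rho)
    return rho

# Hypothetical 3-dimensional embeddings for a toy document "very good movie".
emb = {
    "very":  [0.0, 1.0, 0.0],
    "good":  [1.0, 0.0, 0.0],
    "movie": [0.0, 0.0, 1.0],
}

# Single words plus one adverb-adjective sentiment phrase ("very good").
events = [projector(emb[w]) for w in ("very", "good", "movie")]
events.append(phrase_projector([emb["very"], emb["good"]]))

rho = mle_density(events)
print(np.round(rho, 3))
```

In this toy setup the phrase projector adds weight to the subspace spanned by "very" and "good", so the resulting density matrix carries off-diagonal terms between those two dimensions that a bag of single-word projectors would not produce.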