Abstract

In this paper we describe an automated approach to enriching sentiment analysis with idiom-based features. Specifically, we automated the development of the supporting lexico-semantic resources, which include (1) a set of rules used to identify idioms in text and (2) their sentiment polarity classifications. Our method demonstrates how idiom dictionaries, which are readily available general pedagogical resources, can be adapted automatically into purpose-specific computational resources. These resources were then used to replace their manually engineered counterparts in an existing system, which originally outperformed the baseline sentiment analysis approaches by 17 percentage points on average, raising the F-measure from the 40s into the 60s. The new, fully automated approach outperformed the baselines by 8 percentage points on average, raising the F-measure from the 40s into the 50s. Although the latter improvement is smaller than the one achieved with the manually engineered features, the automated approach has the advantage of being more general, in the sense that it can readily use an arbitrary list of idioms without the knowledge acquisition overhead previously associated with this task, thereby fully automating the original approach.

Highlights

  • Figurative language whose meaning differs from the literal interpretation poses significant challenges to natural language understanding

  • In a previous study we investigated the role of idioms in sentiment analysis [8], an important subarea of natural language understanding whose aim is to automatically interpret opinions, sentiments, attitudes and emotions expressed in written text [9]

  • We demonstrated that automatically engineered idiom-based features improve sentiment analysis results

Summary

Introduction

Figurative language whose meaning differs from the literal interpretation poses significant challenges to natural language understanding. Idioms are considered to be one of the most prominent types of figurative language. Semantic non-compositionality and a degree of fixedness are often taken as key markers of idioms, e.g., [4], [5], [6]. This definition chimes with the popular understanding of the term idiom and works reasonably well for prototypical cases such as fly off the handle. A distinction is often made between idioms of encoding (where idiomatic knowledge is mainly required to produce an idiom, e.g., long time no see) and idioms of decoding (where idiomatic knowledge is required to understand an idiom, e.g., paint the town red) [7], with the latter being of primary interest for natural language understanding.

