Robust Identification of Figurative Language in Personal Health Mentions on Twitter

Usman Naseem, Matloob Khushi, Jinman Kim, Adam G. Dunn

doi:10.1109/tai.2022.3175469

Abstract

People often discuss their health on social media platforms. Discussion of personal experiences with diseases and symptoms can be useful in public health applications such as adverse event surveillance. A major challenge for these applications is distinguishing personal health mentions from other uses of the same disease and symptom terms, including figurative use, where the words mean something different from their literal sense. Prior approaches incorporate some elements of context but could be improved to capture relationships between the linguistic characteristics of figurative expressions and the representations of the context. In this work, we investigate the role of context representation in identifying personal health mentions on social media and measure the impact of different representation choices on detecting figurative use of a range of disease and symptom words. We present an end-to-end approach that selects representations adaptively for different disease or symptom words. We conduct experiments using a publicly available health-mention dataset annotated with ten disease or symptom labels. The results demonstrate that our approach outperforms the state of the art (SOTA) in identifying figurative language use across a range of disease or symptom words, achieving an F1-score of 0.925 (an increase of 10.7% over the SOTA) and correctly identifying a proportion of 0.923 of figurative mentions (an increase of 16.7% over the SOTA). An ablation analysis demonstrates that each of the new modules contributes to this increased performance.

Full Text