Generating paraphrase sentences for multimodal entity-category-sentiment triple extraction

Li Yang, Jieming Wang, Jin-Cheon Na, Jianfei Yu

doi:10.1016/j.knosys.2023.110823

Abstract

Multimodal entity-based sentiment analysis (MEBSA) is an emerging task in sentiment analysis that aims to identify three key elements (entity, entity category, and sentiment polarity) from a sentence-image pair. However, most existing studies have focused on only one or two MEBSA subtasks, ignoring the fact that these subtasks are closely related to one another. Moreover, previous studies detected only coarse-grained entity categories, which do not provide sufficient information to disambiguate entities. To address these two issues, we introduced a new task called multimodal entity-category-sentiment triple extraction (MECSTE), which extracts entities, their corresponding fine-grained entity categories, and their sentiment polarities simultaneously. We constructed two datasets for this task based on two existing Twitter corpora. We also developed a generative multimodal approach based on a pre-trained sequence-to-sequence model that formulates the MECSTE task as a paraphrase generation problem by linearising all the entity-category-sentiment triples into a natural language sentence. Extensive experiments on the annotated Twitter datasets demonstrate the superiority of the proposed method.
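To make the linearisation idea concrete, the following is a minimal sketch of how a set of entity-category-sentiment triples might be rendered as a single target sentence and recovered from a generated sentence. The template wording, the separator, the three polarity labels, and the function names `linearise` and `parse` are illustrative assumptions, not the template or code used in the paper.

```python
import re
from typing import List, Tuple

# (entity, fine-grained category, sentiment polarity)
Triple = Tuple[str, str, str]

# Hypothetical template and separator, chosen for illustration only;
# the paper's actual paraphrase template may differ.
TEMPLATE = "{entity} is of category {category} with {sentiment} sentiment"
SEPARATOR = " ; "

# Assumes three polarity labels (positive, neutral, negative).
CLAUSE = re.compile(
    r"^(?P<entity>.+?) is of category (?P<category>.+?) "
    r"with (?P<sentiment>positive|neutral|negative) sentiment$"
)


def linearise(triples: List[Triple]) -> str:
    """Render all triples as one natural language target sentence."""
    return SEPARATOR.join(
        TEMPLATE.format(entity=e, category=c, sentiment=s)
        for e, c, s in triples
    )


def parse(sentence: str) -> List[Triple]:
    """Recover triples from a generated sentence, skipping malformed clauses."""
    triples: List[Triple] = []
    for clause in sentence.split(SEPARATOR):
        match = CLAUSE.match(clause.strip())
        if match:
            triples.append(
                (match["entity"], match["category"], match["sentiment"])
            )
    return triples


if __name__ == "__main__":
    gold = [("Taylor Swift", "musician", "positive"),
            ("Grammy", "award", "neutral")]
    target = linearise(gold)
    print(target)
    print(parse(target))
```

In a setup like this, a sequence-to-sequence model would be trained to generate the linearised sentence from the text and image inputs, and a parser of this kind would map its output back to triples for evaluation.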

Full Text