Abstract
Abstract The Multimodal Named Entity Recognition (MNER) task enhances the text representations and improves the accuracy and robustness of named entity recognition by leveraging visual information from images. However, previous methods have two limitations: (i) the semantic mismatch between text and image modalities makes it challenging to establish accurate internal connections between words and visual representations. Besides, the limited number of characters in social media posts leads to semantic and contextual ambiguity, further exacerbating the semantic mismatch between modalities. (ii) Existing methods employ cross-modal attention mechanisms to facilitate interaction and fusion between different modalities, overlooking fine-grained correspondences between semantic units of text and images. To alleviate these issues, we propose a graph neural network approach for MNER (GNN-MNER), which promotes fine-grained alignment and interaction between semantic units of different modalities. Specifically, to mitigate the issue of semantic mismatch between modalities, we construct corresponding graph structures for text and images, and leverage graph convolutional networks to augment text and visual representations. For the second issue, we propose a multimodal interaction graph to explicitly represent the fine-grained semantic correspondences between text and visual objects. Based on this graph, we implement deep-level feature fusion between modalities utilizing graph attention networks. Compared with existing methods, our approach is the first to extend graph deep learning throughout the MNER task. Extensive experiments on the Twitter multimodal datasets validate the effectiveness of our GNN-MNER.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.