Abstract

Sarcasm is a sophisticated construct for expressing contempt or ridicule. It is well studied in several disciplines (e.g., neuroanatomy and neuropsychology), but its study in computational science (e.g., Twitter sarcasm detection) is still in its infancy. In contrast to previous methods, which are usually geared toward a single discipline, we focus on multidisciplinary cross-innovation, i.e., improving embryonic sarcasm detection in computational science by leveraging the advanced knowledge of sarcasm cognition in neuroanatomy and neuropsychology. In this work, we target sarcasm detection in social media and propose a multimodal, multi-interactive, and multihierarchical neural network (M3N2). Because the brain's perception of sarcasm requires multiple modalities, we select the tweet text, the image, the text in the image, and the image caption as the inputs of M3N2. To handle these modalities, we introduce singlewise, pairwise, triplewise, and tetradwise modality interactions that incorporate a gate mechanism and guide attention (GA), simulating the interactions and collaborations of the brain regions involved in perceiving multiple modes. Specifically, for each modality interaction we use a multihop process that applies GA repeatedly to extract modal information, yielding multiperspective information. We also adopt a two-level hierarchical structure that uses self-attention followed by attention pooling to integrate multimodal semantic information from different levels, mimicking the brain's first- and second-order comprehension of sarcasm. Experimental results show that M3N2 achieves competitive performance in sarcasm detection and generalizes well to multimodal sentiment analysis and emotion recognition.
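
As a rough illustration of two ideas mentioned above, the following minimal sketch enumerates modality interactions by order and applies a simple attention pooling. It is not the M3N2 implementation. It assumes that every non-empty subset of the four inputs forms one interaction (the abstract does not say whether all subsets are used) and that pooling weights come from a softmax over dot products with a query vector. The names `MODALITIES`, `modality_interactions`, `softmax`, and `attention_pool` are hypothetical.

```python
from itertools import combinations
import math

# Hypothetical labels for the four inputs named in the abstract.
MODALITIES = ("tweet", "image", "text_in_image", "image_caption")


def modality_interactions(modalities=MODALITIES):
    """Group every non-empty subset of modalities by its size.

    Size 1 = singlewise, 2 = pairwise, 3 = triplewise, 4 = tetradwise.
    Assumption: all subsets are used; the abstract does not confirm this.
    """
    return {
        k: list(combinations(modalities, k))
        for k in range(1, len(modalities) + 1)
    }


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Pool equal-length vectors into one vector.

    Each vector is weighted by softmax(dot(vector, query)). In a trained
    model the query would be learned; here it is passed in (an assumption).
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


if __name__ == "__main__":
    for order, subsets in modality_interactions().items():
        print(order, len(subsets))
    print(attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]))
```

Under the all-subsets assumption, four modalities give 4 singlewise, 6 pairwise, 4 triplewise, and 1 tetradwise interaction, 15 in total.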

