Abstract

The emotion and sentiment associated with comic scenes can provide useful information for inferring the context of comic stories, which is an essential prerequisite for developing automatic content-understanding tools for comics. Here, we address this open area of comics research by exploiting the multi-modal nature of comics. Multi-modal sentiment analysis methods generally assume that both image and text modalities are always present at the test phase. However, this assumption is not always satisfied for comics, since comic characters’ facial expressions, gestures, etc., are not always clearly visible. Also, the underlying context is often challenging to comprehend from the dialogues between comic characters. To deal with these constraints of comic emotion analysis, we propose a multi-task-based framework, namely EmoComicNet, that fuses multi-modal information (i.e., both image and text) when it is available. EmoComicNet is also designed to perform even when any modality is weak or completely missing. The proposed method potentially improves the overall performance. Besides, EmoComicNet can also deal with the problem of a weak or absent modality during the training phase.
