Abstract

Leveraging multi-modal information sources has attracted the attention of researchers and practitioners developing resources and technologies in the broad area of applied Artificial Intelligence (AI). During a natural disaster, people rely heavily on social media for communication, posting multimedia information in the form of texts and images. In such critical situations, it becomes imperative to use all modalities of information to better capture vital knowledge related to the crisis. In this paper, we propose an effective deep learning model that leverages multi-modal information sources, in the form of both texts and images, to disseminate useful information at the time of natural disasters. Our proposed model classifies the tweets into seven critical and potentially actionable categories, such as reports of ‘injured or dead people’, ‘infrastructure damage’, etc. Experiments on a benchmark dataset show that fusing the two modalities (text and image) is more effective than using a uni-modal source (i.e. either text or image alone) for extracting meaningful information generated during disaster situations. By using information from both modalities, we obtain a macro F1-score of 0.51, which is a significant improvement over the baseline models that use only text or only image for the classification. We supplement our results with a thorough analysis exploring the reasons for this improvement, further demonstrating the utility of exploiting multiple modalities. The primary contribution of this paper is an attentive deep learning model that uses social media text and images to help classify images into crucial classes for the disaster domain. Our major finding is that using textual features when classifying the corresponding images improves classification performance. We also explore different methods of fusing multi-modal features and conclude that fusion through an attention mechanism works best for image classification in the disaster domain.
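For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro F1-score is computed: the F1-score of each class is calculated separately and the per-class scores are averaged with equal weight, so rare classes count as much as frequent ones. The function name `macro_f1`, the toy labels, and the convention of scoring a class with no true or predicted instances as 0.0 are illustrative assumptions, not details taken from the paper or its dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels=None):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    if labels is None:
        # Assumption: score every class seen in either list.
        labels = sorted(set(y_true) | set(y_pred))

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    scores = []
    for c in labels:
        # F1 = 2*TP / (2*TP + FP + FN); assumed 0.0 when undefined.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, made up for illustration only.
y_true = ["damage", "damage", "injured", "other"]
y_pred = ["damage", "injured", "injured", "other"]
score = macro_f1(y_true, y_pred)
```

Because every class contributes equally to the average, a model that performs well only on the most frequent category still receives a low macro F1-score.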
