Abstract

Analyzing emotional information of visual content has attracted growing attention for the tendency of internet users to share their feelings via images and videos online. In this paper, we investigate the problem of affective image analysis, which is very challenging due to its complexity and subjectivity. Previous research reveals that image emotion is related to low-level to high-level visual features from both global and local view, while most of the current approaches only focus on improving emotion recognition performance based on single-level visual features from a global view. Aiming to utilize different levels of visual features from both global and local view, we propose a multi-level region-based Convolutional Neural Network (CNN) framework to discover the sentimental response of local regions. We first employ Feature Pyramid Network (FPN) to extract multi-level deep representations. Then, an emotional region proposal method is used to generate proper local regions and remove excessive non-emotional regions for image emotion classification. Finally, to deal with the subjectivity in emotional labels, we propose a multi-task loss function to take the probabilities of images belonging to different emotion classes into consideration. Extensive experiments show that our method outperforms the state-of-the-art approaches on various commonly used benchmark datasets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call