Abstract

Images can convey intense experiences and affect viewers on an emotional level. With the prevalence of online pictures and videos, evaluating the emotions evoked by visual content has attracted considerable attention. Affective image recognition aims to automatically classify the emotions conveyed by digital images. Existing studies, whether based on hand-crafted features or deep networks, mainly focus on either low-level visual features or high-level semantic representations without considering both. To better understand how deep networks work on affective recognition tasks, we investigate the convolutional features by visualizing them in this work. Our analysis shows that a hierarchical CNN model relies mainly on deep semantic information while ignoring shallow visual details, which are essential to evoking emotions. To form a more general and discriminative representation, we propose a multi-level hybrid model that learns and integrates deep semantic and shallow visual representations for sentiment classification. In addition, this study shows that class imbalance affects performance, as the majority category of an affective dataset can overwhelm training and degrade the deep network. Therefore, a new loss function is introduced to optimize the deep affective model. Experimental results on several affective image recognition datasets show that our model outperforms various existing approaches. The source code is publicly available.
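To make the two core ideas of the abstract concrete, the following is a minimal sketch, not the authors' code: it fuses a shallow (low-level visual) feature map with a deep (semantic) feature map from a ResNet-18 backbone, and pairs the classifier with an inverse-frequency class-weighted cross-entropy loss as a stand-in for the paper's imbalance-aware loss. The backbone choice, channel sizes, class count, and class frequencies below are all illustrative assumptions, since the abstract does not specify them.

```python
# Sketch of a multi-level hybrid model: concatenate pooled shallow and deep
# CNN features, then classify. Class-weighted cross-entropy stands in for
# the paper's imbalance-aware loss. All specifics here are assumptions.

import torch
import torch.nn as nn
import torchvision.models as models


class MultiLevelHybridNet(nn.Module):
    """Fuses shallow (edges, colour, texture) and deep (semantic) features."""

    def __init__(self, num_classes: int = 8):
        super().__init__()
        backbone = models.resnet18(weights=None)
        # Shallow stage: retains low-level visual detail.
        self.shallow = nn.Sequential(
            backbone.conv1, backbone.bn1, backbone.relu,
            backbone.maxpool, backbone.layer1,
        )
        # Deep stages: high-level semantic content.
        self.deep = nn.Sequential(backbone.layer2, backbone.layer3, backbone.layer4)
        self.pool = nn.AdaptiveAvgPool2d(1)
        # ResNet-18: 64 shallow channels + 512 deep channels after pooling.
        self.classifier = nn.Linear(64 + 512, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.shallow(x)                       # shallow feature map
        d = self.deep(s)                          # deep feature map
        fused = torch.cat(
            [self.pool(s).flatten(1), self.pool(d).flatten(1)], dim=1
        )
        return self.classifier(fused)


if __name__ == "__main__":
    model = MultiLevelHybridNet(num_classes=8)
    # Hypothetical per-class sample counts; weights are inverse-frequency
    # so that the majority class does not overwhelm training.
    counts = torch.tensor([900., 120., 300., 80., 500., 60., 220., 150.])
    weights = counts.sum() / (len(counts) * counts)
    criterion = nn.CrossEntropyLoss(weight=weights)

    logits = model(torch.randn(4, 3, 224, 224))
    loss = criterion(logits, torch.tensor([0, 3, 5, 1]))
    print(logits.shape, loss.item())
```

The design point the sketch illustrates is that the shallow branch is tapped before the semantic stages collapse fine visual detail, so both feature scales reach the classifier; the paper's actual fusion scheme and loss may differ.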
