Abstract

Fine art paintings hold an important place in human history and are a fundamental component of human culture. Although deep learning on fine art paintings is attracting increasing attention from researchers, few works focus on the interplay between visual content, the affect it triggers, and language that aims to explain that affect. Most visual captioning models can only generate captions that describe objective content and lack the ability to generate affective explanations. In this paper, we introduce the VAD (Valence, Arousal, and Dominance) dictionary into our model and propose a gated concatenation mechanism to construct affective word embeddings. Combined with an affective loss function, our model outperforms the state of the art on automatic evaluation metrics and in subjective evaluations.
