Engineering Deep Representations for Modeling Aesthetic Perception.

Yanxiang Chen, Chao Zhang, Ping Li, Yuxing Hu, Luming Zhang

doi:10.1109/tcyb.2017.2758350

Abstract

Many aesthetic models in multimedia and computer vision suffer from two shortcomings: 1) the low descriptiveness and interpretability¹ of hand-crafted aesthetic criteria (i.e., they fail to indicate region-level aesthetics) and 2) the difficulty of engineering aesthetic features adaptively and automatically for different image sets. To remedy these problems, we develop a deep architecture that learns aesthetically relevant visual attributes from Flickr,² which are localized by multiple textual attributes in a weakly supervised setting. More specifically, using a bag-of-words representation of the frequent Flickr image tags, a sparsity-constrained subspace algorithm discovers a compact set of textual attributes for each Flickr image (i.e., each textual attribute is a sparse linear representation of those frequent image tags). Then, a weakly supervised learning algorithm projects the image-level textual attributes onto the highly responsive image patches. These patches indicate the appealing regions that humans look at with respect to each textual attribute, and they are employed to learn the visual attributes. Psychological and anatomical studies have demonstrated that humans perceive visual concepts hierarchically. Therefore, we normalize these patches and feed them into a five-layer convolutional neural network to mimic the hierarchy by which humans perceive the visual attributes. We apply the learned deep features to applications such as image retargeting, aesthetics ranking, and retrieval. Both subjective and objective experimental results demonstrate the superiority of our approach.

¹ In this paper, "descriptiveness" and "interpretability" mean the ability to find a region-level representation of each mined textual attribute, i.e., a sparse linear representation of the frequent image tags.
² https://www.flickr.com/.
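
To make the first step of the abstract concrete, the following is a minimal sketch, not the authors' sparsity-constrained subspace algorithm. It encodes one image's tags as a bag-of-words vector and then expresses that vector sparsely over a small set of textual attributes using a generic L1-penalized solver (ISTA), swapped in here for illustration. The vocabulary, the two attribute rows, the penalty `lam`, and all function names are assumptions chosen for the example.

```python
import numpy as np


def bag_of_words(tags, vocabulary):
    """Count how often each vocabulary tag occurs in one image's tag list."""
    index = {tag: i for i, tag in enumerate(vocabulary)}
    counts = np.zeros(len(vocabulary))
    for tag in tags:
        if tag in index:  # tags outside the frequent-tag vocabulary are ignored
            counts[index[tag]] += 1.0
    return counts


def soft_threshold(values, threshold):
    """Shrink each entry toward zero; entries below the threshold become zero."""
    return np.sign(values) * np.maximum(np.abs(values) - threshold, 0.0)


def sparse_code(x, attributes, lam=0.1, n_iter=200):
    """ISTA for min_a 0.5 * ||x - attributes.T @ a||^2 + lam * ||a||_1.

    `attributes` has one row per textual attribute, each row being a
    linear combination of tags (hypothetical values in this sketch).
    """
    # Step size 1/L, where L is the squared spectral norm of the attribute matrix.
    step = 1.0 / np.linalg.norm(attributes, 2) ** 2
    codes = np.zeros(attributes.shape[0])
    for _ in range(n_iter):
        gradient = attributes @ (attributes.T @ codes - x)
        codes = soft_threshold(codes - step * gradient, step * lam)
    return codes


# Hypothetical frequent-tag vocabulary and two hand-made textual attributes.
vocabulary = ["sunset", "beach", "portrait", "macro"]
attributes = np.array([[1.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 1.0]]) / np.sqrt(2.0)

x = bag_of_words(["sunset", "beach", "sunset", "vacation"], vocabulary)
codes = sparse_code(x, attributes)
print(codes)
```

In this toy setting only the first attribute receives a nonzero weight, because the image's tags fall entirely within that attribute's support; the L1 penalty keeps the second weight at exactly zero.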

Full Text