Abstract
Photo aesthetic quality evaluation is a challenging task in the multimedia and computer vision fields. Conventional approaches suffer from three drawbacks: 1) the deemphasized role of semantic content, which is many times more important than low-level visual features in photo aesthetics; 2) the difficulty of optimally fusing low-level and high-level visual cues for photo aesthetics evaluation; and 3) the absence of a sequential viewing path in existing models, even though humans perceive visually salient regions sequentially when viewing a photo. To address these problems, we propose a new aesthetic descriptor that mimics the way humans sequentially perceive visually/semantically salient regions in a photo. In particular, a weakly supervised learning paradigm is developed to project the local aesthetic descriptors (graphlets in this work) into a low-dimensional semantic space. Thereafter, each graphlet can be described by multiple types of visual features, at both low and high levels. Since humans usually perceive only a few salient regions in a photo, a sparsity-constrained graphlet ranking algorithm is proposed that seamlessly integrates both low-level and high-level visual cues. Top-ranked graphlets are the visually/semantically prominent graphlets in a photo. They are sequentially linked into a path that simulates the process of humans actively viewing the photo. Finally, we learn a probabilistic aesthetic measure based on such actively viewing paths (AVPs) from training photos marked as aesthetically pleasing by multiple users. Experimental results show that: 1) the AVPs are 87.65% consistent with real human gaze shifting paths, as verified by eye-tracking data; and 2) our photo aesthetic measure outperforms many of its competitors.
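To make the rank-then-link idea in the abstract concrete, the following is a minimal, hypothetical Python sketch, not the paper's algorithm: graphlet extraction, the weakly supervised semantic projection, and the probabilistic aesthetic measure are replaced by placeholders, the learned fusion of low-level and high-level cues is approximated by a weighted sum, and the sparsity constraint is imitated with a simple soft-thresholding step. All function names and parameters (rank_graphlets, link_into_avp, alpha, l1_strength, k) are illustrative assumptions.

```python
# Hypothetical sketch of sparsity-constrained graphlet ranking followed by
# linking the top-ranked graphlets into an actively viewing path (AVP).
import numpy as np

def rank_graphlets(low_level, high_level, alpha=0.5, l1_strength=0.1):
    """Score each graphlet by fusing low- and high-level cues.

    low_level, high_level: (n_graphlets, d1) and (n_graphlets, d2) feature
    matrices, assumed already extracted and projected into the semantic space.
    The fused score is soft-thresholded so only a few graphlets keep a
    non-zero score, mimicking the sparsity constraint (humans attend to few
    salient regions).
    """
    # Weighted sum of feature norms: a placeholder for the learned fusion.
    score = alpha * np.linalg.norm(low_level, axis=1) \
          + (1.0 - alpha) * np.linalg.norm(high_level, axis=1)
    # L1-style shrinkage: keep only the prominent graphlets.
    return np.maximum(score - l1_strength * score.max(), 0.0)

def link_into_avp(centers, scores, k=5):
    """Greedily link the top-k graphlets into a viewing path.

    centers: (n_graphlets, 2) graphlet centers in image coordinates.
    Starts at the highest-scoring graphlet and repeatedly jumps to the
    nearest remaining top-k graphlet; a crude stand-in for the paper's
    path construction.
    """
    top = np.argsort(scores)[::-1][:k]
    path, remaining = [int(top[0])], list(top[1:])
    while remaining:
        last = centers[path[-1]]
        nxt = min(remaining, key=lambda i: np.linalg.norm(centers[i] - last))
        path.append(int(nxt))
        remaining.remove(nxt)
    return path

# Toy usage with random features and coordinates.
rng = np.random.default_rng(0)
low = rng.random((20, 16))    # hypothetical low-level descriptors
high = rng.random((20, 8))    # hypothetical semantic-space descriptors
xy = rng.random((20, 2))      # graphlet centers in normalized coordinates
print("AVP (graphlet indices):", link_into_avp(xy, rank_graphlets(low, high)))
```

In the paper, the resulting AVPs from aesthetically pleasing training photos would then be used to learn the probabilistic aesthetic measure; that learning step is not sketched here.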