Abstract

Photo aesthetic quality evaluation is a challenging task in artificial intelligence systems. In this paper, we propose a biologically inspired aesthetic descriptor that mimics how humans sequentially perceive visually/semantically salient regions in a photo. In general, visually salient regions are perceived through low-level visual features, such as high contrast between foreground and background objects, while semantically salient regions are perceived through high-level visual features, such as human faces. In particular, a weakly supervised learning paradigm is developed to project the local image descriptors into a low-dimensional semantic space. Each graphlet can then be described by multiple types of visual features, both low-level and high-level. Since humans usually perceive only a few salient regions in a photo, a sparsity-constrained graphlet ranking algorithm is proposed that seamlessly integrates both the low-level and the high-level visual cues. The top-ranked graphlets are the visually/semantically prominent local aesthetic descriptors in a photo. They are sequentially linked into a path that simulates the human active viewing process. Finally, we learn a probabilistic aesthetic measure based on such actively viewing paths (AVPs) from the training photos. Experimental results show that: 1) the AVPs are 87.65% consistent with real human gaze shifting paths, as verified by eye-tracking data, and 2) our aesthetic measure outperforms many of its competitors.
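
The rank-then-link idea described above can be illustrated with a toy sketch. This is not the paper's sparsity-constrained ranking algorithm: it assumes each graphlet already carries one low-level and one high-level saliency score, combines them with a simple weighted sum, keeps only the top k, and orders them into a path. The `Graphlet` class, `build_viewing_path` function, `alpha` weight, and all scores are hypothetical names and values introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Graphlet:
    """Hypothetical container for one local region descriptor."""
    name: str
    low_level: float   # assumed contrast-style saliency score in [0, 1]
    high_level: float  # assumed semantic score (e.g. face likelihood) in [0, 1]


def build_viewing_path(graphlets: List[Graphlet], k: int, alpha: float = 0.5) -> List[str]:
    """Return the names of the k top-scoring graphlets, most salient first.

    The weighted sum is an assumption standing in for the paper's
    ranking algorithm; keeping only k items mimics the sparsity idea.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    ranked = sorted(
        graphlets,
        key=lambda g: alpha * g.low_level + (1.0 - alpha) * g.high_level,
        reverse=True,
    )
    # Linking the survivors in rank order yields a toy "viewing path".
    return [g.name for g in ranked[:k]]


# Made-up scores for four regions of an imaginary photo.
photo = [
    Graphlet("sky", 0.2, 0.1),
    Graphlet("face", 0.6, 0.9),
    Graphlet("red_flower", 0.9, 0.3),
    Graphlet("grass", 0.1, 0.1),
]
path = build_viewing_path(photo, k=2)
```

With equal weighting, the face region ranks ahead of the high-contrast flower because its semantic score dominates; setting `alpha=1.0` would rank by the low-level score alone and reverse that order.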
