Abstract

In response to the growth of digital photography and its many related applications, researchers have been actively investigating methods for automated aesthetic evaluation and classification of photographs. For computational networks to recognize aesthetic qualities, the learning algorithms must be trained on sample sets of characteristics with known aesthetic values. Traditional methods for building such training sets required manual extraction of aesthetic features. With the rise of convolutional neural networks (CNNs), features can be learned automatically, and such networks have become important tools for aesthetic evaluation and classification. At the time of our research, however, existing CNNs for aesthetic photograph classification used only shallow networks, which limits achievable performance. In addition, most methods extracted only one patch per image as a training sample, such as a single down-sized crop. A single patch might not represent the entire image accurately, which can introduce ambiguity during training. Moreover, in existing datasets, the number of high-quality images in each category is usually too small to train deep CNNs. To address these problems, we introduce a novel photograph aesthetic classifier that uses a deep and wide CNN for fine-granularity aesthetic quality prediction. First, we download a large number of consumer photographs from DPChallenge.com (a well-known online photography portal) to construct a dataset suitable for aesthetic quality assessment. Then, we downscale each image to 256×256 via bilinear interpolation and crop ten patches (the center and four corners, each with its horizontal flip). Once we have associated the set with the image's training labels, we feed the bag of patches into the fine-tuned networks.
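The ten-crop augmentation described above (center + four corners, each with a horizontal flip) can be sketched as follows. This is a minimal NumPy illustration, not the paper's code; the 224×224 crop size is an assumption based on common CNN input dimensions, and the bilinear 256×256 resize is presumed to have already been applied.

```python
import numpy as np

def ten_crop(img, crop=224):
    """Extract 10 patches from a resized (H, W, C) image:
    four corners + center, plus the horizontal flip of each."""
    h, w = img.shape[:2]
    c = crop
    tops_lefts = [
        (0, 0), (0, w - c), (h - c, 0), (h - c, w - c),  # four corners
        ((h - c) // 2, (w - c) // 2),                    # center
    ]
    patches = [img[t:t + c, l:l + c] for t, l in tops_lefts]
    patches += [p[:, ::-1] for p in patches]  # horizontal flips
    return np.stack(patches)

# A 256x256 RGB image yields a (10, 224, 224, 3) bag of patches.
img = np.random.rand(256, 256, 3)
bag = ten_crop(img)
```

Feeding all ten patches per image multiplies the effective training-set size and reduces the ambiguity a single crop can introduce.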
Our proposed computational method classifies photographs into high and low aesthetic categories. A training pattern with an output of (0, 1) indicates that the corresponding image belongs to the “low aesthetic quality” set; likewise, an output of (1, 0) indicates that the image belongs to the “high aesthetic quality” set. Experimental results show that the classification accuracy of our method exceeds 87.10%, noticeably better than state-of-the-art methods. In addition, our experiments show that our results are fundamentally consistent with human visual perception and aesthetic judgment.
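The two-unit output encoding above amounts to a one-hot target vector. A minimal sketch of such an encoder is shown below, assuming NumPy float targets for a two-way softmax head (an illustrative assumption, not the paper's implementation):

```python
import numpy as np

def aesthetic_label(is_high_quality):
    """One-hot training target per the paper's convention:
    (1, 0) = high aesthetic quality, (0, 1) = low aesthetic quality."""
    return np.array([1.0, 0.0]) if is_high_quality else np.array([0.0, 1.0])

high = aesthetic_label(True)   # -> array([1., 0.])
low = aesthetic_label(False)   # -> array([0., 1.])
```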
