Image Privacy Prediction Using Deep Neural Networks

Ashwini Tonge, Cornelia Caragea

doi:10.1145/3386082

Abstract

Images today are increasingly shared online on social networking sites such as Facebook, Flickr, and Instagram. Image sharing occurs not only within a group of friends but also more and more outside a user’s social circles for purposes of social discovery. Although current social networking sites allow users to change their privacy preferences, doing so is often cumbersome for the vast majority of users on the Web, who face difficulties in assigning and managing privacy settings. When these privacy settings are used inappropriately, online image sharing can potentially lead to unwanted disclosures and privacy violations. Thus, automatically predicting images’ privacy to warn users about private or sensitive content before they upload these images to social networking sites has become a necessity in our current interconnected world. In this article, we explore learning models to automatically predict an image’s appropriate privacy setting as private or public using carefully identified image-specific features. We study deep visual semantic features that are derived from various layers of Convolutional Neural Networks (CNNs) as well as textual features such as user tags and deep tags generated from deep CNNs. In particular, we extract deep (visual and tag) features from four pre-trained CNN architectures for object recognition, i.e., AlexNet, GoogLeNet, VGG-16, and ResNet, and compare their performance for image privacy prediction. The results of our experiments obtained on a Flickr dataset of 32,000 images show that ResNet yields the best results for this task among all four networks. We also fine-tune the pre-trained CNN architectures on our privacy dataset and compare their performance with the models trained on pre-trained features. The results show that even though the overall performance obtained using the fine-tuned networks is comparable to that of pre-trained networks, the fine-tuned networks provide improved performance for the private class.
The results also show that the learning models trained on features extracted from ResNet outperform the state-of-the-art models for image privacy prediction. We further investigate the combination of user tags and deep tags derived from CNN architectures using two settings: (1) Support Vector Machines trained on the bag-of-tags features and (2) text-based CNN. We compare these models with the models trained on ResNet visual features and show that, even though the models trained on the visual features perform better than those trained on the tag features, the combination of deep visual features with image tags improves performance over the individual feature sets. We also compare our models with prior privacy prediction approaches and show that, for the private class, we achieve an improvement of approximately 10% over prior CNN-based privacy prediction approaches. Our code, features, and the dataset used in experiments are available at https://github.com/ashwinitonge/deepprivate.git.
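To make the bag-of-tags idea and the combination with visual features concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a binary presence encoding of tags and simple concatenation with a visual feature vector. The function names, the sample tags, and the two-dimensional "visual" vectors are hypothetical and serve only as illustration. The resulting vectors could then be passed to a classifier such as an SVM.

```python
from typing import Dict, List, Sequence


def build_vocabulary(tag_lists: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map each distinct (lowercased) tag to a column index, in first-seen order."""
    vocab: Dict[str, int] = {}
    for tags in tag_lists:
        for tag in tags:
            key = tag.lower()
            if key not in vocab:
                vocab[key] = len(vocab)
    return vocab


def bag_of_tags(tags: Sequence[str], vocab: Dict[str, int]) -> List[float]:
    """Binary bag-of-tags vector: 1.0 where a tag is present (an assumed encoding)."""
    vec = [0.0] * len(vocab)
    for tag in tags:
        idx = vocab.get(tag.lower())
        if idx is not None:  # tags outside the vocabulary are ignored
            vec[idx] = 1.0
    return vec


def combine_features(visual: Sequence[float], tag_vec: Sequence[float]) -> List[float]:
    """Concatenate visual features with the tag vector (one simple way to combine them)."""
    return list(visual) + list(tag_vec)


# Hypothetical data: user-provided tags and CNN-derived "deep tags" for two images.
user_tags = [["birthday", "family"], ["landscape", "mountain"]]
deep_tags = [["person", "cake"], ["alp", "valley"]]

# Merge both tag sources per image before encoding.
merged = [u + d for u, d in zip(user_tags, deep_tags)]
vocab = build_vocabulary(merged)
tag_vectors = [bag_of_tags(t, vocab) for t in merged]

# Placeholder visual features; real deep features would have far more dimensions.
visual = [[0.12, 0.80], [0.95, 0.03]]
combined = [combine_features(v, t) for v, t in zip(visual, tag_vectors)]
```

Each entry of `combined` is a single feature vector per image, with the visual dimensions first and one dimension per vocabulary tag after them.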

Full Text