Abstract

Visual media has become one of the most potent means of conveying opinions and sentiments on the web. Millions of photos are uploaded to popular social networking sites by people expressing themselves. Visual sentiment analysis is inherently subjective owing to the bias in human perception. This work proposes a residual attention-based deep learning network (RA-DLNet) for the problem of visual sentiment analysis. We aim to learn the spatial hierarchies of image features using a convolutional neural network (CNN). Since local regions of an image convey significant sentiment cues, we apply a residual attention model that focuses on the crucial, sentiment-rich local regions. A further contribution of this work is an exhaustive analysis of seven popular CNN-based architectures: VGG-16, VGG-19, Inception-ResNet-V2, Inception-V3, ResNet-50, Xception, and NASNet. The impact of fine-tuning these CNN variants is demonstrated in the visual sentiment analysis domain. Extensive experiments are conducted on eight popular benchmark data sets, and performance is measured in terms of accuracy. A comparison with similar state-of-the-art methods demonstrates the superiority of the proposed work.
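
The abstract does not reproduce implementation details of RA-DLNet. As a rough illustration of the residual attention idea it names, the sketch below follows the standard (1 + M(x)) · T(x) formulation popularized by Wang et al.'s Residual Attention Network, written in PyTorch. All module names, layer choices, and shapes here are hypothetical assumptions for illustration, not taken from the paper.

```python
import torch
import torch.nn as nn

class ResidualAttentionBlock(nn.Module):
    """Hypothetical residual attention block: a soft mask re-weights
    sentiment-rich spatial regions, while the (1 + mask) residual
    connection preserves the original CNN features elsewhere."""

    def __init__(self, channels: int):
        super().__init__()
        # Trunk branch: ordinary convolutional feature processing.
        self.trunk = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Mask branch: per-location soft attention map in [0, 1].
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.trunk(x)
        m = self.mask(x)
        # Residual attention: (1 + M) * T amplifies attended regions
        # without suppressing features where the mask is near zero.
        return (1 + m) * t

# Usage on a batch of CNN feature maps (e.g., 256 channels, 14x14 spatial).
features = torch.randn(8, 256, 14, 14)
out = ResidualAttentionBlock(256)(features)
print(out.shape)  # torch.Size([8, 256, 14, 14])
```

The fine-tuning study over the seven backbones typically amounts to replacing the classifier head of an ImageNet-pretrained network. A minimal sketch, assuming torchvision and a binary positive/negative sentiment task (the actual label set, frozen-layer policy, and training schedule are not specified in the abstract):

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet-50 (downloads weights on first use).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)

# Freeze the pretrained backbone; only the new head will be trained.
for param in model.parameters():
    param.requires_grad = False

# Replace the 1000-way ImageNet head with a 2-way sentiment classifier.
model.fc = nn.Linear(model.fc.in_features, 2)
```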

