Abstract

Automatic image cropping is a practical but challenging task that aims to improve the aesthetic quality of an image by removing irrelevant areas. Most previous image cropping methods ignore the compositional relationships among different regions of an image. Yet global compositional relationships are essential for a cropping model to decide whether to retain a given object in the input image. In this work, we propose a multi-scale attention network (MSANet) to address this issue. We employ three plug-and-play attention modules to capture context at three different scales. The multi-scale attention (MSA) module ensures that our model perceives objects of different sizes and preserves the areas that are needed. Moreover, we design a border-reserved, grid-anchor-based formulation to better handle cases where the subjects lie at the edge of the input image. A cosine similarity loss function is also used to obtain stable results. Extensive quantitative and qualitative experiments show that our model is well aware of the compositional relationships within images. Compared to existing works, our multi-scale attention network achieves state-of-the-art performance with less time and a lighter model.
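As an illustration of the cosine similarity loss mentioned above, the following is a minimal sketch and not the authors' implementation. It assumes the loss compares a vector of predicted scores for candidate crops with a vector of reference scores and takes the form 1 - cos(predicted, target). The function name, the argument names, and the `eps` guard are hypothetical.

```python
import math


def cosine_similarity_loss(predicted, target, eps=1e-8):
    """Return 1 - cos(predicted, target) for two equal-length score vectors.

    Assumption: one score per candidate crop; the abstract does not
    specify the exact form of the loss, so this is only one plausible reading.
    """
    if len(predicted) != len(target):
        raise ValueError("score vectors must have the same length")
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    # eps guards against division by zero when either vector is all zeros.
    return 1.0 - dot / max(norm_p * norm_t, eps)


# Hypothetical scores for three candidate crops.
loss = cosine_similarity_loss([0.2, 0.9, 0.4], [0.1, 1.0, 0.5])
```

Under this reading the loss depends only on the direction of the score vector, not on its scale, so predictions that order and space the candidate crops like the reference scores are not penalized for a uniform scaling. That property may be one reason such a loss yields stable results, although the abstract does not say so.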
