Proposals Generation for Weakly Supervised Object Detection in Artwork Images.

Federico Milani, Piero Fraternali, Nicolò Oreste Pinciroli Vago

doi:10.3390/jimaging8080215

Open Access

Journal: Journal of Imaging
Publication Date: Aug 6, 2022
Citations: 3
License type: CC BY 4.0

Affiliation: Politecnico di Milano

Abstract

Object Detection requires many precise annotations, which are available for natural images but not for many non-natural data sets such as artwork data sets. One solution is to use Weakly Supervised Object Detection (WSOD) techniques, which learn accurate object localization from image-level labels. Studies have demonstrated that state-of-the-art end-to-end architectures may not be suitable for domains in which images or classes differ substantially from those used to pre-train the networks. This paper presents a novel two-stage Weakly Supervised Object Detection approach for obtaining accurate bounding boxes on non-natural data sets. The proposed method exploits existing classification knowledge to generate pseudo-ground truth bounding boxes from Class Activation Maps (CAMs). The automatically generated annotations are then used to train a robust Faster R-CNN object detector. Quantitative and qualitative analyses show that bounding boxes generated from CAMs can compensate for the lack of manually annotated ground truth (GT) and that an object detector trained with such pseudo-GT surpasses end-to-end WSOD state-of-the-art methods on two artwork data sets: ArtDL 2.0 (≈41.5% mAP) and IconArt (≈17% mAP). The proposed solution is a step towards the computer-aided study of non-natural images and opens the way to more advanced tasks, e.g., automatic artwork image captioning for digital archive applications.
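
To make the idea of deriving pseudo-ground truth boxes from a CAM more concrete, the following is a minimal sketch, not the authors' procedure. It assumes a CAM given as a 2D grid of activation values, keeps the cells at or above a fraction of the peak activation, and returns one box per 4-connected region. The function name `cam_to_boxes`, the parameter `rel_threshold`, and its default of 0.5 are hypothetical choices made for illustration; the abstract does not specify how the boxes are extracted.

```python
from collections import deque


def cam_to_boxes(cam, rel_threshold=0.5):
    """Turn a class activation map into candidate bounding boxes.

    cam: 2D sequence (rows of floats), one activation value per cell.
    rel_threshold: hypothetical parameter; cells with a value of at least
        rel_threshold * max(cam) are treated as foreground.
    Returns a list of (x_min, y_min, x_max, y_max) tuples in cell
    coordinates (inclusive), one per 4-connected foreground region,
    in row-major scan order.
    """
    height = len(cam)
    width = len(cam[0]) if height else 0
    peak = max((max(row) for row in cam), default=0.0)
    if width == 0 or peak <= 0:
        return []  # no activation, so no box is proposed

    cutoff = rel_threshold * peak
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or cam[y0][x0] < cutoff:
                continue
            # Breadth-first flood fill over one connected region,
            # tracking the extent of the region as it grows.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            x_min = x_max = x0
            y_min = y_max = y0
            while queue:
                y, x = queue.popleft()
                x_min, x_max = min(x_min, x), max(x_max, x)
                y_min, y_max = min(y_min, y), max(y_max, y)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and cam[ny][nx] >= cutoff):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((x_min, y_min, x_max, y_max))
    return boxes


# Toy 6x8 map with two separate activation blobs.
toy_cam = [[0.0] * 8 for _ in range(6)]
for y in (1, 2):
    for x in (1, 2, 3):
        toy_cam[y][x] = 1.0
for y in (4, 5):
    for x in (5, 6, 7):
        toy_cam[y][x] = 0.8

print(cam_to_boxes(toy_cam, rel_threshold=0.5))
```

In a two-stage setup like the one the abstract describes, boxes of this kind would be rescaled from CAM resolution to image resolution and paired with the image-level class label before being used as detector training targets. The threshold controls the trade-off between tight boxes around the most discriminative part and looser boxes covering the whole object.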

Full Text