Abstract

Because bounding-box annotations are unavailable, most weakly supervised object detection methods recast detection as a classification problem over candidate regions. As a result, weakly supervised detectors tend to localize only the most salient, highly discriminative parts of an object rather than its full extent. We propose a weakly supervised object detection method that combines attention and erasure mechanisms. The method uses attention maps to find the more discriminative areas within candidate regions and then erases them, forcing the model to learn features from less discriminative areas. To improve the detector's localization ability, we cascade the weakly supervised detection network with a fully supervised detection network and train the two jointly through multi-task learning. In validation experiments on the two datasets, VOC2007 and VOC2012, the mean average precision (mAP) and correct localization (CorLoc) are 55.2% and 53.8%, respectively. In terms of mAP and CorLoc, the approach significantly outperforms previous approaches, which opens the way for further research on weakly supervised object detection algorithms.
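
The attention-guided erasure step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `erase_most_discriminative`, the min-max normalization, and the `threshold` value of 0.6 are all assumptions chosen for illustration, since the abstract does not specify how the attention map is computed or thresholded.

```python
import numpy as np


def erase_most_discriminative(features, attention, threshold=0.6):
    """Zero out the feature locations where attention is highest.

    features:  array of shape (C, H, W), features of one candidate region.
    attention: array of shape (H, W), non-negative attention scores.
    threshold: hypothetical cut-off on the normalized attention map;
               locations at or above it are treated as highly
               discriminative and erased.
    Returns the erased features and the boolean keep-mask.
    """
    # Min-max normalize the attention map to [0, 1] (an assumed choice).
    a_min, a_max = attention.min(), attention.max()
    normalized = (attention - a_min) / (a_max - a_min + 1e-8)

    # Keep only the less discriminative locations.
    keep_mask = normalized < threshold

    # Broadcast the (H, W) mask over the channel dimension.
    erased = features * keep_mask[None, :, :]
    return erased, keep_mask


if __name__ == "__main__":
    feats = np.ones((2, 3, 3))
    att = np.zeros((3, 3))
    att[1, 1] = 1.0  # a single strongly attended location
    erased, keep = erase_most_discriminative(feats, att)
    print(keep.astype(int))
```

In this sketch, a downstream classifier trained on the erased features can no longer rely on the strongly attended location, which is the intuition behind forcing the model to learn from less discriminative areas.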
