Abstract

Object detection aims to locate and recognize objects in images or videos, and it underpins many downstream intelligent applications. Recently, emerging gigapixel videography has attracted considerable attention from computer vision, microscopy, telescopy and many other communities. Its large field of view and high spatial resolution provide rich global and local information simultaneously. Although state-of-the-art detection methods have achieved success on common images, they cannot be transferred to gigapixel images while remaining both effective and efficient. To solve this problem, we make the first attempt towards accurate and real-time object detection in gigapixel video. In this paper we propose a novel framework, termed GigaDet, which adopts an efficient global-to-local strategy, following the principle of the human visual system. Exploiting the spatial sparsity of objects, a patch generation network (PGN) is introduced to globally locate regions likely to contain objects and to determine the proper resize ratio for each patch. The collected multi-scale patches are then fed in parallel into a decorated detector (DecDet), which performs accurate and fast detection locally. We carry out extensive experiments on the PANDA dataset, where GigaDet yields 76.2% AP at 5 FPS on a single 2080 Ti GPU, which is comparably accurate to but 50x faster than Faster R-CNN. We believe this research can inspire new applications based on gigapixel video across a wide range of fields.
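
The global-to-local strategy described above can be illustrated with a minimal sketch. This is not the authors' PGN or DecDet: a thresholded coarse score grid stands in for the patch generation network, and the names `Patch`, `propose_patches`, `to_global`, `cell_px`, `target_px` and `thresh` are all hypothetical, chosen only for illustration. The sketch shows the two ideas the abstract names: skipping empty regions because objects are sparse, and assigning each patch a resize ratio so that objects reach a size a local detector can handle.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Patch:
    x: int        # left edge in full-resolution pixels
    y: int        # top edge in full-resolution pixels
    size: int     # side length of the square crop in full-resolution pixels
    ratio: float  # resize ratio applied to the crop before local detection


def propose_patches(
    objectness: Sequence[Sequence[float]],
    scale_px: Sequence[Sequence[float]],
    cell_px: int,
    target_px: float = 64.0,
    thresh: float = 0.5,
) -> List[Patch]:
    """Global stage (assumed stand-in for a learned patch generator).

    objectness: coarse grid of scores, one per cell of the full frame.
    scale_px:   estimated object size (in full-resolution pixels) per cell.
    """
    patches = []
    for r, row in enumerate(objectness):
        for c, score in enumerate(row):
            if score < thresh:
                continue  # objects are sparse, so most cells are skipped
            # Resize so that the estimated object size becomes target_px.
            ratio = target_px / scale_px[r][c]
            patches.append(Patch(c * cell_px, r * cell_px, cell_px, ratio))
    return patches


def to_global(box: Tuple[float, float, float, float], patch: Patch):
    """Map a box (x1, y1, x2, y2) found in the resized patch back to
    full-frame coordinates by undoing the resize and the crop offset."""
    x1, y1, x2, y2 = box
    return (
        patch.x + x1 / patch.ratio,
        patch.y + y1 / patch.ratio,
        patch.x + x2 / patch.ratio,
        patch.y + y2 / patch.ratio,
    )
```

In this sketch a local detector would run on each resized patch (in a batch, if patches are resized to a common shape), and `to_global` would merge its outputs back into the coordinate frame of the gigapixel image.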
