Image Resizing for Object Detection: A Learnable Downsampler–Upsampler Pair with Differentiable Image Entropy Estimation

Chengjie Dai, Bowei Yang, Guanghua Song, Qiang Chen, Hanshen Gong, Jingchao Xu

doi:10.1142/s0218001423540113

Abstract

In recent years, super-resolution neural networks have achieved good results in restoring super-resolved images from low-resolution ones. However, most downstream tasks that consume super-resolved images, such as object detection, are performed by computers rather than by human viewers. With this in mind, we propose a learnable downsampler–upsampler pair that implements both the downscaling and the upscaling process with neural networks and is trained jointly with the YOLOv5 network to optimize the object detection task. Thus, unlike existing super-resolution networks, the entire downsampler–upsampler pair is optimized for machine perception. In addition, to further reduce the size of the downsampled images, we propose a differentiable method for estimating image entropy and add it to the loss function. We verify the effectiveness of our method on the pothole dataset, using scale factors of 2× and 4× to show that it handles different resizing levels. The experimental results show that using our learnable downsampler–upsampler pair as the resizing method substantially improves detection performance compared with other resizing techniques.
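
To make the idea of a differentiable entropy estimate concrete, here is a minimal sketch. The abstract does not say how the estimator is built, so this uses one common technique, a Gaussian soft histogram, as an assumption; it is not necessarily the authors' method. The function name `soft_entropy` and the parameters `num_bins` and `sigma` are hypothetical.

```python
import math


def soft_entropy(pixels, num_bins=16, sigma=0.05):
    """Smooth estimate of Shannon entropy (in bits) of pixel intensities.

    Assumption: `pixels` is a flat iterable of intensities in [0, 1].
    A hard histogram has zero gradient almost everywhere, so each pixel
    instead casts a Gaussian-weighted soft vote over all bin centers.
    """
    centers = [(k + 0.5) / num_bins for k in range(num_bins)]
    mass = [0.0] * num_bins
    count = 0
    for x in pixels:
        # Clamp so that at least one bin always receives non-zero weight.
        x = min(1.0, max(0.0, x))
        weights = [math.exp(-0.5 * ((x - c) / sigma) ** 2) for c in centers]
        total = sum(weights)
        for k, w in enumerate(weights):
            mass[k] += w / total  # each pixel contributes total mass 1
        count += 1
    if count == 0:
        raise ValueError("pixels must not be empty")
    probs = [m / count for m in mass]
    # Shannon entropy of the soft bin probabilities.
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


flat = [0.5] * 64                      # constant image: little information
varied = [i / 63 for i in range(64)]   # intensity ramp: many distinct values
h_flat = soft_entropy(flat)
h_varied = soft_entropy(varied)
```

Here the ramp scores a higher entropy than the constant image, and both values are bounded by log2(num_bins). In a training setup the same operations would be written with an automatic-differentiation framework's tensors, so that the entropy term in the loss can pass gradients back to the downsampler.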

Full Text