Impact of Image Resizing on Deep Learning Detectors for Training Time and Model Performance

Sergio Saponara, Abdussalam Elhanashi

doi:10.1007/978-3-030-95498-7_2

Abstract

Resizing images is a critical pre-processing step in computer vision. In general, deep learning models train faster on smaller images. An input image with twice the width and height requires the neural network to learn from four times as many pixels, which increases the training time for the architecture. In this work, we present an evaluation of the effects of image resizing on model training time and performance. The study is applied to a vehicle dataset. We used You Only Look Once (YOLO) based architectures, namely YOLOv2, YOLOv3, YOLOv4, and YOLOv5, with pretrained models to perform object detection. YOLO is designed to detect objects with high accuracy and high speed, which is an advantage for real-time applications. Data augmentation, which approximates the data probability distribution by manipulating the input samples, is used in this research to reduce overfitting. The experimental results show that varying the input image size affects the training time of CNN-based image classification. Additionally, this research reviews image resizing and its impact on the models’ performance in terms of accuracy, precision, and recall.
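The relationship between image size and pixel count noted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the two resolutions (208x208 and 416x416) are assumptions chosen for the example, not values reported in this work.

```python
def pixel_count(width: int, height: int) -> int:
    """Return the number of pixels in a width x height image."""
    return width * height


def pixel_ratio(small: tuple[int, int], large: tuple[int, int]) -> float:
    """Return how many times more pixels `large` has than `small`."""
    return pixel_count(*large) / pixel_count(*small)


# Hypothetical input sizes (assumed for illustration, not from the paper).
small_size = (208, 208)
large_size = (416, 416)

# Doubling both width and height quadruples the pixel count.
ratio = pixel_ratio(small_size, large_size)
print(f"{large_size} has {ratio:.1f}x the pixels of {small_size}")
```

The per-image pixel count grows quadratically with the side length, which is why even modest increases in input resolution can noticeably lengthen training.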

Full Text