Efficient Shot Detector: Lightweight Network Based on Deep Learning Using Feature Pyramid

Chansoo Park, Hyunho Han, Sanghun Lee

doi:10.3390/app11188692

Open Access

Journal: Applied Sciences	Publication Date: Sep 17, 2021
Citations: 5	License type: CC BY 4.0

Affiliation: Kwangwoon University, University of Ulsan

Abstract

Convolutional neural network (CNN)-based methods are used in a growing range of industries as deep learning technologies develop rapidly. However, inference efficiency remains a problem in applications that require real-time performance, such as those running on mobile devices. It is therefore important to design a lightweight network that can be used in general-purpose environments, including both mobile and GPU environments. In this study, we propose the efficient shot detector (ESDet), a lightweight deep-learning-based network with a small number of parameters. Feature extraction uses depthwise and pointwise convolution to minimize the computational complexity of the proposed network. The subsequent layers are arranged in a feature pyramid structure so that the extracted features are robust to multiscale objects. The network was trained with prior boxes optimized for the data set at each feature scale. We defined an ESDet-baseline with optimal parameters through experiments and expanded it by gradually increasing the input resolution to improve detection accuracy. ESDet was trained and evaluated on the PASCAL VOC and MS COCO 2017 data sets, and average precision (AP) was used as the quantitative measure of detection performance. The experiments showed superior detection efficiency compared with conventional detection methods.

Highlights

Owing to rapid advances and the current level of computational power, deep-learning-based methods can be applied with high accuracy in several applications, including autonomous driving systems [1], air traffic control [2], and image restoration [3], which demonstrates their capacity to replace existing, traditional systems
Lightweight deep learning research based on the convolutional neural network (CNN), which includes changing the convolutional filter of the network [4], network discovery (e.g., AutoML) [5], and changing the network architecture [6], is being actively pursued to make efficient use of limited system resources
We propose a novel lightweight network, called the efficient shot detector (ESDet), for efficient object detection; it extracts the features required for detection from the EfficientNet backbone and stacks them into a feature pyramid

Summary

Introduction

Owing to rapid advances and the current level of computational power, deep-learning-based methods can be applied with high accuracy in several applications, including autonomous driving systems [1], air traffic control [2], and image restoration [3], which demonstrates their capacity to replace existing, traditional systems. Lightweight deep learning research based on the convolutional neural network (CNN), which includes changing the convolutional filter of the network [4], network discovery (e.g., AutoML) [5], and changing the network architecture [6], is being actively pursued to make efficient use of limited system resources. Within this research, many studies focus on improving the convolution filter and the network architecture, which otherwise require high computational cost. Networks that follow this approach include the residual neural network (ResNet) [7], the dense convolutional network (DenseNet) [8], and MobileNet [9]. MobileNet performs convolution separately per channel (depthwise) and per feature point (pointwise) instead of using the standard convolution.
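
To make the depthwise/pointwise idea concrete, the following is a minimal sketch, not taken from the paper, that counts weights and multiply-accumulate operations (MACs) for a standard convolution and for a depthwise convolution followed by a pointwise one. The function names, the 3 x 3 kernel, and the channel and feature-map sizes are illustrative assumptions; bias terms are ignored, and stride 1 with "same" padding is assumed.

```python
# Illustrative cost comparison. The layer sizes are assumptions,
# not values reported for ESDet or MobileNet.


def standard_conv_cost(k, c_in, c_out, h, w):
    """Weights and MACs of a k x k standard convolution.

    Assumes stride 1, 'same' padding (output is h x w), and no bias.
    """
    params = k * k * c_in * c_out
    macs = params * h * w
    return params, macs


def separable_conv_cost(k, c_in, c_out, h, w):
    """Cost of a depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise_params = k * k * c_in   # one k x k filter per input channel
    pointwise_params = c_in * c_out   # 1 x 1 convolution that mixes channels
    params = depthwise_params + pointwise_params
    macs = params * h * w
    return params, macs


if __name__ == "__main__":
    k, c_in, c_out, h, w = 3, 64, 128, 56, 56  # hypothetical layer
    std_params, std_macs = standard_conv_cost(k, c_in, c_out, h, w)
    sep_params, sep_macs = separable_conv_cost(k, c_in, c_out, h, w)
    print(f"standard : {std_params} weights, {std_macs} MACs")
    print(f"separable: {sep_params} weights, {sep_macs} MACs")
    print(f"ratio    : {sep_params / std_params:.4f}")
    print(f"1/c_out + 1/k^2 = {1 / c_out + 1 / k ** 2:.4f}")
```

Under these assumptions the cost ratio equals 1/c_out + 1/k^2 for any layer, so with a 3 x 3 kernel the separable form approaches a ninefold reduction in weights and MACs as the number of output channels grows.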

Objectives

Methods

Results

Discussion

Conclusion