Optimized convolutional neural network architectures for efficient on-device vision-based object detection

Ivan Rodriguez-Conde,Celso Campos,Florentino Fdez-Riverola

doi:10.1007/s00521-021-06830-w

Abstract

Convolutional neural networks have pushed forward image analysis research and computer vision over the last decade, constituting a state-of-the-art approach in object detection today. The design of increasingly deeper and wider architectures has made it possible to achieve unprecedented levels of detection accuracy, albeit at the cost of both a dramatic computational burden and a large memory footprint. In such a context, cloud systems have become a mainstream technological solution due to their tremendous scalability, providing researchers and practitioners with virtually unlimited resources. However, these resources are typically made available as remote services, requiring communication over the network to be accessed, thus compromising the speed of response, availability, and security of the implemented solution. In view of these limitations, the on-device paradigm has emerged as a recent yet widely explored alternative, pursuing more compact and efficient networks to ultimately enable the execution of the derived models directly on resource-constrained client devices. This study provides an up-to-date review of the more relevant scientific research carried out in this vein, circumscribed to the object detection problem. In particular, the paper contributes to the field with a comprehensive architectural overview of both the existing lightweight object detection frameworks targeted to mobile and embedded devices, and the underlying convolutional neural networks that make up their internal structure. More specifically, it addresses the main structural-level strategies used for conceiving the various components of a detection pipeline (i.e., backbone, neck, and head), as well as the most salient techniques proposed for adapting such structures and the resulting architectures to more austere deployment environments. 
Finally, the study concludes with a discussion of the specific challenges and next steps to be taken to move toward a more convenient accuracy–speed trade-off.

Highlights

Despite being widely studied over the last three decades, object detection still represents a highly complex problem and remains an uphill challenge of great interest in research.
First presented in 1989 by LeCun et al. [7], convolutional neural networks (CNNs) have emerged over the last decade as a major driver of progress in image analysis and computer vision, delivering state-of-the-art results in terms of accuracy. Though statistical classifiers, such as support vector machines (SVM) [8], Random Forest [9], AdaBoost [10], or traditional neural networks, were considered the standard in computer vision for many years and had a leading role in object detection tasks, the relatively recent breakthrough of deep learning (DL) techniques represents an unquestionable leap over previous object detection research, enabling the detection of objects in more complex situations and simplifying the design process of the algorithmic solutions pursued.
Within the group of micro approaches, we find a wide range of options that can be categorized into two distinct subgroups: an initial collection of techniques that focus on convolutional-filter-specific aspects or properties, such as the number of filters [107], their size in the spatial dimension [49], the number of channels [49, 52, 101, 105, 107, 130], the communication between them [50, 54], or the number of channel groups [101]; and a second subgroup encompassing methods that target the internal structure of layers or modules, such as the exploitation of alternative operations to convolution [47, 48, 50, 52, 54, 105, 107, 128, 130, 131, 133, 134, 135, 136], the replacement [48] or omission [133] of nonlinearity, or the application of an attention mechanism [53, 132, 133].
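To make the filter-level properties listed above more concrete, the following minimal sketch counts the weights of a single convolutional layer as the number of filters, the spatial kernel size, and the number of channel groups change, and compares the result with a depthwise separable convolution, one widely used alternative to standard convolution. The function names and the 128-channel layer are illustrative assumptions, not taken from the paper, and the formulas ignore bias terms.

```python
def conv_params(kernel_size, in_channels, out_channels, groups=1):
    """Weight count of a k x k convolution, ignoring bias terms.

    Each of the out_channels filters only sees in_channels / groups input
    channels, so grouping divides the weight count by the number of groups.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return kernel_size * kernel_size * (in_channels // groups) * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    """Weight count of a depthwise k x k conv followed by a pointwise 1 x 1 conv."""
    # Depthwise step: one k x k filter per input channel (groups == in_channels).
    depthwise = conv_params(kernel_size, in_channels, in_channels, groups=in_channels)
    # Pointwise step: 1 x 1 convolution that mixes information across channels.
    pointwise = conv_params(1, in_channels, out_channels)
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 128 input channels, 128 filters.
    variants = {
        "standard 3x3": conv_params(3, 128, 128),
        "half the filters": conv_params(3, 128, 64),
        "1x1 kernel": conv_params(1, 128, 128),
        "4 channel groups": conv_params(3, 128, 128, groups=4),
        "depthwise separable": depthwise_separable_params(3, 128, 128),
    }
    for name, count in variants.items():
        print(f"{name:>20}: {count} weights")
```

Under these assumptions, each change lowers the weight count of the layer. The effect on accuracy is not captured here and depends on the architecture and the task.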

Summary

Introduction

Despite being widely studied over the last three decades, object detection still represents a highly complex problem and remains an uphill challenge of great interest in research. First presented in 1989 by LeCun et al. [7], convolutional neural networks (CNNs) have emerged over the last decade as a major driver of progress in image analysis and computer vision, delivering state-of-the-art results in terms of accuracy. Though statistical classifiers, such as support vector machines (SVM) [8], Random Forest [9], AdaBoost [10], or traditional neural networks, were considered the standard in computer vision for many years and had a leading role in object detection tasks, the relatively recent breakthrough of deep learning (DL) techniques represents an unquestionable leap over previous object detection research, enabling the detection of objects in more complex situations and simplifying the design process of the algorithmic solutions pursued. CNNs represent a comprehensive detection solution that, due to their ability to exploit both spatial and temporal correlation of input data, enables feature representation learning to be carried out directly with no need for domain expertise, which is an essential requirement for designing feature extraction algorithms such as scale-invariant feature transform (SIFT) [11], histogram of oriented gradients (HOG) [12], or local binary patterns (LBP) [13], which are omnipresent among the more classical approaches.



Journal: Neural Computing and Applications
Publication Date: Dec 27, 2021
Citations: 16
License type: open-access


