An Efficient Deep Convolutional Neural Network Approach for Object Detection and Recognition Using a Multi-Scale Anchor Box in Real-Time

Vijayakumar Varadarajan, Dweepna Garg, Ketan Kotecha

doi:10.3390/fi13120307

Abstract

Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at multiple levels of representation, which aids in understanding data such as text, voice, and visuals. Convolutional neural networks have been used to solve problems in computer vision, including object identification, image classification, and semantic segmentation. Object detection in videos involves confirming the presence of an object in the image or video and then locating it accurately for recognition. For video, modelling techniques suffer from high computation and memory costs, which may degrade the accuracy and efficiency with which objects are identified in real time. Current object detection techniques based on deep convolutional neural networks execute multilevel convolution and pooling operations on the entire image to extract deep semantic features from it. Detection models can provide superior results for large objects; however, they fail to detect objects of varying size that have low resolution and are strongly affected by noise, because the features remaining after the repeated convolution operations of existing models do not fully represent the essential characteristics of such objects in real time. With the help of a multi-scale anchor box, the approach proposed in this paper improves detection accuracy by extracting features of the object at multiple convolution levels. The major contribution of this paper is a model designed to better understand the parameters and hyper-parameters that affect the detection and recognition of objects of varying sizes and shapes, and to achieve real-time object detection and recognition speeds while improving accuracy. The proposed model achieved 84.49 mAP on the test set of the Pascal VOC-2007 dataset at 11 FPS, which is comparatively better than other real-time object detection models.
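The idea of a multi-scale anchor box can be illustrated with a minimal sketch. It assumes an SSD-style layout in which each feature-map level tiles the image with anchors of one base scale and several aspect ratios; the feature-map sizes, scales, and aspect ratios below are hypothetical values chosen for illustration and are not taken from the paper.

```python
from itertools import product
from math import sqrt


def make_anchors(feature_map_sizes, scales, aspect_ratios):
    """Return anchors as (cx, cy, w, h) tuples, normalized to the unit square.

    Each feature-map level gets its own base scale, so coarse levels
    (few cells) carry large anchors and fine levels carry small ones.
    """
    anchors = []
    for size, scale in zip(feature_map_sizes, scales):
        for row, col in product(range(size), repeat=2):
            # Anchor centre sits in the middle of the feature-map cell.
            cx = (col + 0.5) / size
            cy = (row + 0.5) / size
            for ratio in aspect_ratios:
                # ratio = width / height; the area stays scale**2.
                w = scale * sqrt(ratio)
                h = scale / sqrt(ratio)
                anchors.append((cx, cy, w, h))
    return anchors


# Hypothetical configuration: three levels, three aspect ratios per cell.
anchors = make_anchors(
    feature_map_sizes=[8, 4, 2],
    scales=[0.1, 0.3, 0.6],
    aspect_ratios=[0.5, 1.0, 2.0],
)
print(len(anchors))  # (64 + 16 + 4) cells * 3 ratios = 252
```

In a detector, each anchor would then be matched against ground-truth boxes and refined by predicted offsets; that step is omitted here.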

Highlights

This paper focuses on object detection and recognition
Feature extractors such as HOG, SIFT, and SURF follow the traditional machine learning approach, i.e., features are extracted first and the algorithm is then trained to achieve the desired output; deep learning algorithms have shown a significant advantage over this approach by learning directly from the data
The image is taken as the input, and the output is obtained as a class or as the probability that the input belongs to a particular class
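As a rough illustration of an output given "in the form of class or the probability", the sketch below converts raw class scores into probabilities with a softmax and picks the most likely class. The class names and scores are made up for the example, and softmax is assumed here as the usual choice for a classification head; the excerpt does not state which output layer the paper uses.

```python
from math import exp


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["cat", "dog", "car"]  # hypothetical labels
scores = [1.0, 3.0, 0.5]             # hypothetical network outputs

probs = softmax(scores)
best = max(range(len(probs)), key=probs.__getitem__)
print(class_names[best], round(probs[best], 3))
```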

Summary

Introduction and Scope

A good deep learning algorithm is trained on a large amount of data, and its parameters can be tuned. Past work on object detection extracted features using algorithms such as HOG [4], SIFT [5], and SURF [6]. These algorithms follow the traditional machine learning approach, i.e., features are extracted first and the algorithm is then trained to achieve the desired output; deep learning algorithms have shown a significant advantage over this approach by learning directly from the data. The scope of our work is limited to the Pascal VOC dataset [7].
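To make the "feature extraction first" step of the traditional approach concrete, here is a heavily simplified sketch in the spirit of HOG: a single histogram of gradient orientations over one small grayscale patch. It is not the full HOG descriptor (there is no cell grid and no block normalization), and the function name, bin count, and toy patch are assumptions made for illustration.

```python
from math import atan2, hypot, pi


def orientation_histogram(patch, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    # Skip the border so the central differences stay inside the patch.
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = patch[r][c + 1] - patch[r][c - 1]
            gy = patch[r + 1][c] - patch[r - 1][c]
            magnitude = hypot(gx, gy)
            angle = atan2(gy, gx) % pi  # unsigned orientation in [0, pi)
            index = min(int(angle / pi * bins), bins - 1)
            hist[index] += magnitude
    return hist


# Toy 4x4 patch with a vertical edge: dark on the left, bright on the right.
patch = [[0, 0, 1, 1] for _ in range(4)]
hist = orientation_histogram(patch)
print(hist)
```

A hand-crafted vector like `hist` would then be passed to a separately trained classifier, whereas a convolutional network learns its features and its classifier together from the data.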

Contributions

Novelty

Outline

Background

Building Blocks of CNN

Detection pipeline using

Faster R-CNN

Mask R-CNN

YOLO Versions

RefineDet512

CenterNet

Limitations

Proposed Architecture

Architecture

Efficient Multi-Scale Anchor Box Approach

Experiments

Findings

Conclusions


Journal: Future Internet
Publication Date: Nov 29, 2021
Citations: 9
License type: CC BY 4.0


