Abstract

Neural networks have enabled state-of-the-art approaches to achieve strong results on computer vision tasks such as object detection. However, although previous works have tried to improve performance through various object detection necks, they have failed to extract features efficiently. To address the insufficient feature representation of objects, this work builds on some of the most advanced and representative network models based on the Faster R-CNN architecture, such as Libra R-CNN, Grid R-CNN, Guided Anchoring, and GRoIE. We evaluate Neighbour Feature Pyramid Network (NFPN) fusion, ResNet Region of Interest Feature Extraction (ResRoIE), and the Recursive Feature Pyramid (RFP) architecture on the MS COCO dataset, measuring precision at different object scales when these components replace the corresponding original components in various networks. After the neck and RoIE parts of these models are replaced with our Reinforced Neighbour Feature Fusion (RNFF) model, the average precision (AP) increases by 3.2 percentage points relative to the baseline network.

Highlights

  • Target detection is an essential task in deep learning; it answers the question “what objects are located where”

  • We report Neighbour Feature Pyramid Network (NFPN) experiments conducted on LISA [26], and Table 1 shows the enhancement of the average precision (AP) and the advantages for objects of different scales

  • The application of our Reinforced Neighbour Feature Fusion (RNFF) method in combination with other recent networks is investigated by replacing the original neck and RoIE parts of the networks


Summary

Introduction

Target detection is an essential task in deep learning; it answers the question “what objects are located where”. Traditional object detection algorithms mainly rely on hand-crafted features that capture information such as edges, colors, and textures, and detect objects with support vector machines. With the advancement of deep learning, detection algorithms based on convolutional neural networks have been proposed, and detection accuracy has improved greatly. Object detection also has a potential impact on the development of the fundamentals of deep learning techniques, and it may help to reduce the amount of labeled data required in many deep learning tasks, such as recognition and instance segmentation. One-stage object detection algorithms, such as SSD [2], YOLOv1 [3], YOLOv2 [4], YOLOv3 [5], YOLOv4 [6], RetinaNet [7], CornerNet [8], CornerNet-Lite [9], CenterNet [10], FCOS [11], and ExtremeNet [12], do not require a region proposal network; the features extracted by the deep convolutional network are used directly to classify objects and regress their position coordinates.
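The one-stage idea described above, in which class scores and box coordinates are predicted directly from the extracted features without a proposal step, can be illustrated with a minimal sketch. This is not the implementation of any detector cited here; the function names, the toy feature map, and the weights are all hypothetical and chosen only to make the data flow visible.

```python
# Hypothetical one-stage head: every name and number here is illustrative,
# not taken from any of the detectors cited in the text.
from typing import List, Tuple

Vector = List[float]


def linear(weights: List[Vector], bias: Vector, x: Vector) -> Vector:
    """Apply a dense layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def one_stage_head(
    feature_map: List[List[Vector]],
    cls_w: List[Vector], cls_b: Vector,
    box_w: List[Vector], box_b: Vector,
) -> List[Tuple[int, int, int, float, Vector]]:
    """Predict a class and a box at every feature-map location, with no proposal step."""
    detections = []
    for row_idx, row in enumerate(feature_map):
        for col_idx, feat in enumerate(row):
            scores = linear(cls_w, cls_b, feat)  # per-class scores
            box = linear(box_w, box_b, feat)     # four box regression values
            best = max(range(len(scores)), key=scores.__getitem__)
            detections.append((row_idx, col_idx, best, scores[best], box))
    return detections


# Toy 1x2 feature map with 2-channel features and 2 classes (assumed values).
feature_map = [[[1.0, 0.0], [0.0, 1.0]]]
cls_w, cls_b = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
box_w = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0], [1.0, 1.0]]
box_b = [0.0, 0.0, 0.0, 0.0]
detections = one_stage_head(feature_map, cls_w, cls_b, box_w, box_b)
```

Each entry of `detections` holds the grid location, the highest-scoring class, its score, and the regressed box values. A two-stage detector such as Faster R-CNN would instead first generate region proposals and only then classify and refine them.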
