Abstract

Current object detection techniques have difficulty detecting small objects and achieve low accuracy on occluded objects. To solve these problems, this paper proposes an object detection framework named FFAN, which is based on Faster R-CNN and introduces a feature fusion network and an adversary occlusion network into the structure. The feature fusion network combines a feature map with low resolution and high semantic information with a feature map with high resolution and low semantic information using the deconvolution operation, increasing the network's ability to extract low-level features. FFAN then generates a single advanced feature map with both high resolution and high semantic information, which is used to detect small objects in the image more effectively. The adversary occlusion network creates occlusion on a deep feature map of the object, generating adversary training samples that are difficult for the detector to discriminate. At the same time, the detector learns to accurately classify the generated occluded adversary samples. The two networks compete with and learn from each other, further improving the performance of the algorithm. We train FFAN on the PASCAL VOC 2007, PASCAL VOC 2012, MS COCO and KITTI datasets. A number of quantitative and qualitative experiments show that FFAN achieves state-of-the-art detection accuracy.
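The fusion step described above can be sketched in a few lines. This is a minimal illustration only: it uses fixed nearest-neighbor upsampling and element-wise addition, whereas FFAN uses a learned deconvolution (transposed convolution) inside the network, and the array shapes here are invented toy values.

```python
import numpy as np

def upsample_2x(feat):
    """2x nearest-neighbor upsampling of a 2-D feature map.
    (Stands in for FFAN's learned deconvolution operation.)"""
    return feat.repeat(2, axis=0).repeat(2, axis=1)

def fuse(low_res_high_sem, high_res_low_sem):
    """Combine an upsampled deep feature map with a shallow one,
    yielding a single map with high resolution and high semantics."""
    up = upsample_2x(low_res_high_sem)
    assert up.shape == high_res_low_sem.shape
    return up + high_res_low_sem

deep = np.ones((4, 4))     # low resolution, high semantic information
shallow = np.ones((8, 8))  # high resolution, low semantic information
fused = fuse(deep, shallow)
print(fused.shape)  # (8, 8): small-object detail preserved at full resolution
```

In practice the fusion is performed on multi-channel convolutional feature maps and the combined map feeds the region proposal and detection heads.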

Highlights

  • In recent years, with the rapid development of deep neural networks, object detection technology based on deep learning has made great progress

  • To solve the problem of poor detection results for small objects, and to provide a way of generating samples with different occlusions instead of generating the pixels directly, this paper designs a new object detection framework named FFAN, based on Faster R-CNN [4], which introduces a feature fusion network into the structure and includes an adversary occlusion network that creates occlusion on the deep feature map of an object after the multilayer features are fused, in order to improve the detection accuracy of partially occluded objects

  • As the FFAN detection network is based on Faster R-CNN, we briefly review the Faster R-CNN network before describing the structure of the feature fusion network


Summary

INTRODUCTION

With the rapid development of deep neural networks, object detection technology based on deep learning has made great progress. A number of methods have been used to generate a variety of images [14]–[22], and a collection of partially occluded object instances can be generated by Generative Adversarial Networks (GANs), which can produce realistic images. However, this is not a reliable solution, because generating these images requires a large number of similar training samples. To solve the problem of poor detection results for small objects, and to provide a way of generating samples with different occlusions instead of generating the pixels directly, this paper designs a new object detection framework named FFAN. FFAN is based on Faster R-CNN [4], introduces a feature fusion network into the structure, and includes an adversary occlusion network that creates occlusion on the deep feature map of an object after the multilayer features are fused, in order to improve the detection accuracy of partially occluded objects.
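The key idea of occluding features rather than pixels can be sketched as follows. This is an illustrative toy only: the occlusion location and size are fixed by hand here, whereas the actual adversary occlusion network learns which spatial block to drop so as to maximize the detector's loss, and the feature-map dimensions below are invented.

```python
import numpy as np

def occlude_features(feat, top, left, size):
    """Zero out a square spatial block of a feature map of shape
    (H, W, C), simulating occlusion at the feature level rather
    than generating occluded image pixels."""
    occluded = feat.copy()
    occluded[top:top + size, left:left + size, :] = 0.0
    return occluded

# Toy RoI feature map, strictly positive so the occlusion is visible.
feat = np.random.rand(8, 8, 16) + 0.1
adv = occlude_features(feat, top=2, left=2, size=3)
print(adv[2:5, 2:5].sum())  # 0.0: the occluded block is dropped
print(adv.shape)            # (8, 8, 16): shape is unchanged
```

Such feature-level occluded samples are cheap to produce and need no extra training images, which is the advantage claimed over GAN-based pixel generation.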

RELATED WORK
ADVERSARY LEARNING
NETWORK ARCHITECTURE
EXPERIMENTAL RESULTS AND ANALYSIS
CONCLUSION