Abstract

Convolutional neural networks (CNNs) have rapidly become the state-of-the-art models for image classification applications. They usually require large ground-truthed datasets for training. Here, we address object identification and recognition in the wild for infrared (IR) imaging in defense applications, where no such large-scale dataset is available. With a focus on robustness issues, especially viewpoint invariance, we introduce a compact and fully convolutional CNN architecture with global average pooling. We show that this model, trained on realistic simulation datasets, reaches state-of-the-art performance compared with other CNNs with no data augmentation or fine-tuning steps. We also demonstrate a significant improvement in robustness to viewpoint changes with respect to an operational support vector machine (SVM)-based scheme.
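To make the role of global average pooling concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the last convolutional layer emits one feature map per class (the array shapes, the class count of 8, and the helper names `global_average_pool` and `softmax` are illustrative assumptions). It shows that averaging over the spatial axes yields a fixed-length score vector whatever the spatial size of the input.

```python
import numpy as np


def global_average_pool(feature_maps: np.ndarray) -> np.ndarray:
    """Average each channel over its spatial dimensions.

    feature_maps has shape (channels, height, width);
    the result has shape (channels,).
    """
    return feature_maps.mean(axis=(1, 2))


def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D score vector."""
    shifted = scores - scores.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical final-layer feature maps with one channel per class.
# The two spatial sizes stand in for input patches of different sizes.
rng = np.random.default_rng(0)
small = rng.normal(size=(8, 4, 4))
large = rng.normal(size=(8, 9, 9))

# Both inputs produce an 8-element probability vector.
probs_small = softmax(global_average_pool(small))
probs_large = softmax(global_average_pool(large))
```

Because the pooled value is a plain spatial mean, it does not change when the content of a feature map is circularly shifted, which gives some intuition for why this design is attractive when viewpoint and localization robustness matter.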


Summary

Introduction

As shown on reference datasets such as ImageNet [1], convolutional neural networks (CNNs) have become the state-of-the-art approaches for object classification in images. Since high robustness is of key importance in defense applications, modular solutions that are easier to understand and evaluate may be preferred, for instance, with distinct modules for detection and classification. Based on this observation, we focus on modular solutions and assume that we are provided with a target detection algorithm, which extracts image patches for a recognition and identification stage. Benchmarking experiments on real IR image patches demonstrate the relevance of our compact and fully convolutional CNN (cfCNN) with respect to state-of-the-art CNN architectures and an approach based on support vector machines (SVM). They also stress the importance of a realistic synthetic dataset. Given the targeted operational contexts, we study the robustness of our cfCNN against possible perturbations introduced by the detection stage. To simulate such behaviour, we train our network on images with centred targets and test it on translated or scaled inputs.
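The robustness protocol described above (train on centred targets, test on translated or scaled inputs) can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the authors' evaluation code: the helpers `translate`, `rescale`, and `perturbed_accuracy` are hypothetical names, patches are assumed to be 2-D arrays, scaling uses nearest-neighbour resampling about the patch centre, and `predict` stands for any trained classifier that maps a patch to a label.

```python
import numpy as np


def translate(patch, dy, dx, fill=0.0):
    """Shift a 2-D patch by (dy, dx) pixels, padding uncovered pixels with `fill`."""
    h, w = patch.shape
    out = np.full_like(patch, fill)
    if abs(dy) >= h or abs(dx) >= w:
        return out  # the content has left the frame entirely
    # Source and destination windows that remain inside the frame
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = patch[src_y, src_x]
    return out


def rescale(patch, factor, fill=0.0):
    """Zoom a 2-D patch about its centre by `factor`, keeping the patch size."""
    h, w = patch.shape
    # For each output pixel, find the input pixel it maps back to
    ys = np.round((np.arange(h) - (h - 1) / 2) / factor + (h - 1) / 2).astype(int)
    xs = np.round((np.arange(w) - (w - 1) / 2) / factor + (w - 1) / 2).astype(int)
    # Output pixels whose source falls outside the frame keep the fill value
    valid = np.outer((ys >= 0) & (ys < h), (xs >= 0) & (xs < w))
    src = patch[np.clip(ys, 0, h - 1)][:, np.clip(xs, 0, w - 1)]
    out = np.full_like(patch, fill)
    out[valid] = src[valid]
    return out


def perturbed_accuracy(predict, patches, labels, transform):
    """Fraction of patches still classified correctly after `transform`."""
    predictions = [predict(transform(p)) for p in patches]
    return float(np.mean(np.array(predictions) == np.array(labels)))


# Toy example: an 8x8 patch with a centred 2x2 "target"
patch = np.zeros((8, 8))
patch[3:5, 3:5] = 1.0
shifted = translate(patch, dy=2, dx=-1)  # simulated localization error
zoomed = rescale(patch, factor=2.0)      # simulated scale error
```

In practice, one would sweep the offset and the scale factor over a range and plot `perturbed_accuracy` against each, so that the degradation caused by an imperfect detection stage can be compared across classifiers.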

Related Work and Problem Statement
Deep Learning for Recognition and Identification in Infrared Images
Learning Strategies with Simulated Datasets
Viewpoint Invariance
Simulated and Real Infrared Data
Training Datasets
Test Datasets
Description of the cfCNN
Training Consideration
Results
CNN Performance on Real Data for Identification and Recognition
Performance Gains Using Improved Simulation
CNN Performance Comparison
Gains Using Leaky ReLU as the Main Nonlinearity
Dataset Size and Batch Size
Evaluation of the Robustness to Localization and Scaling Errors
Impact of Localization Errors on Identification Performance
Robustness against Scale Modification
Conclusions