Abstract

This paper employs state-of-the-art Deep Convolutional Neural Networks (CNNs), namely AlexNet, VGGNet, Inception, ResNet and ResNeXt in a first experimental study of ear recognition on the unconstrained EarVN1.0 dataset. As the dataset size is still insufficient to train deep CNNs from scratch, we utilize transfer learning and propose different domain adaptation strategies. The experiments show that our networks, which are fine-tuned using custom-sized inputs determined specifically for each CNN architecture, obtain state-of-the-art recognition performance where a single ResNeXt101 model achieves a rank-1 recognition accuracy of 93.45%. Moreover, we achieve the best rank-1 recognition accuracy of 95.85% using an ensemble of fine-tuned ResNeXt101 models. In order to explain the performance differences between models and make our results more interpretable, we employ the t-SNE algorithm to explore and visualize the learned features. Feature visualizations show well-separated clusters representing ear images of the different subjects. This indicates that discriminative and ear-specific features are learned when applying our proposed learning strategies.

Highlights

  • Ear recognition refers to the process of automated human recognition based on the physical characteristics of the ears

  • We address the problem of recognizing ear images collected under unconstrained conditions through utilizing transfer learning with deep Convolutional Neural Networks (CNNs), and report the results of the first ear recognition experiments on the EarVN1.0 dataset

  • The first set of experiments is to determine how good the pretrained ImageNet models perform in representing ear images from the EarVN1.0 dataset

Read more

Summary

Introduction

Ear recognition refers to the process of automated human recognition based on the physical characteristics of the ears. There are several desirable characteristics of the human ears which include: ease of capture from a distance, stability over time, ability to identify identical twins [1], and being insensitive to emotions and facial expressions [2], [3]. Given these appealing features we can build and develop reliable recognition systems on numerous devices in a non-intrusive and non-distracting manner [4]–[6]. An accurate recognition can be a challenging task when ear images are acquired in unconstrained environments where various

Methods
Results
Conclusion

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.