In recent years, interest has grown in the use of biometric systems for identity authentication in digital services, forensics, and security applications. Even a high-performing unimodal system (one that relies on a single biometric trait) remains vulnerable to falsification attacks such as spoofing. For this reason, research on multimodal biometrics (which combine several biometric traits) has increased, with the aim of reinforcing security, improving recognition performance, and making false identity authentication more difficult. In this paper, we propose a bimodal system that combines speech and face modalities by concatenating their feature vectors. These vectors are extracted by two convolutional neural networks (CNNs) and used for identity verification. The performance of each unimodal CNN was evaluated individually and compared with that of the bimodal system of concatenated vectors. A data augmentation scheme is applied to both modalities to evaluate different operating conditions. Results are reported in terms of Equal Error Rate (EER).
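As an illustration of feature-level fusion and of the EER metric, the following is a minimal sketch and not the implementation used in this work. It assumes that each CNN yields a fixed-length embedding, that each embedding is L2-normalised before concatenation, and that verification trials are scored by cosine similarity. These choices, and the names `fuse`, `cosine_score`, and `equal_error_rate`, are hypothetical and are not specified in the abstract.

```python
import numpy as np


def l2_normalize(v):
    """Scale a vector (or the rows of a matrix) to unit Euclidean length."""
    v = np.asarray(v, dtype=float)
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)


def fuse(face_emb, speech_emb):
    """Feature-level fusion: concatenate the two modality embeddings.

    Normalising each modality first is an assumption made here so that
    neither embedding dominates the fused vector because of its scale.
    """
    return np.concatenate(
        [l2_normalize(face_emb), l2_normalize(speech_emb)], axis=-1
    )


def cosine_score(a, b):
    """Similarity between an enrolled vector and a probe vector (assumed scoring rule)."""
    return float(np.dot(l2_normalize(a), l2_normalize(b)))


def equal_error_rate(genuine, impostor):
    """Approximate the EER from genuine and impostor similarity scores.

    Every observed score is tried as a threshold. The EER is taken at the
    threshold where the false acceptance rate (FAR) and the false rejection
    rate (FRR) are closest, and is reported as their mean.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)
```

In this sketch, a claimed identity would be accepted when `cosine_score` between the enrolled fused vector and the probe fused vector exceeds a chosen threshold. The EER summarises the trade-off between false acceptances and false rejections over all thresholds in a single number.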