Abstract
Background: Recognition is an essential function of human beings. Humans easily recognize a person using various inputs such as voice, face, or gesture. In this study, we focus on a deep learning (DL) model with multiple modalities, which has many benefits including noise reduction. We used ResNet-50 to extract features from the datasets with 2D data.
Results: This study proposes a novel multimodal and multitask model that can both identify a person's ID and classify gender in a single step. At the feature level, the extracted features are concatenated as the input for the identification module. Additionally, our model design allows the number of modalities used in a single model to be changed. To demonstrate our model, we generated 58 virtual subjects from public ECG, face, and fingerprint datasets. In tests with noisy input, the multimodal model was more robust and performed better than a single-modality model.
Conclusions: This paper presents an end-to-end approach for multimodal and multitask learning. The proposed model shows robustness against spoof attacks, which can be significant for bio-authentication devices. Based on the results of this study, we suggest a new perspective on the human identification task that performs better than previous approaches.
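The feature-level concatenation and the two-task output described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `FusionMultitaskHead`, the feature dimensions, and the random, untrained linear heads are all assumptions made for illustration, and the per-modality feature vectors stand in for the outputs of feature extractors such as ResNet-50. Only the 58 subjects and the three modality names come from the text.

```python
import random


def init_linear(n_out, n_in, rng):
    """Create a small random (untrained) dense layer: weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Apply a dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


class FusionMultitaskHead:
    """Hypothetical sketch: concatenate per-modality features, then run two heads."""

    def __init__(self, feature_dims, n_subjects=58, n_genders=2, seed=0):
        rng = random.Random(seed)
        # The set of modalities is configurable, e.g. {"ecg": 4} or all three.
        self.feature_dims = dict(feature_dims)
        fused_dim = sum(self.feature_dims.values())
        # Both task heads share the same fused feature vector.
        self.id_head = init_linear(n_subjects, fused_dim, rng)
        self.gender_head = init_linear(n_genders, fused_dim, rng)

    def fuse(self, features):
        """Feature-level fusion: concatenate vectors in a fixed modality order."""
        fused = []
        for name, dim in self.feature_dims.items():
            vec = features[name]
            if len(vec) != dim:
                raise ValueError(f"{name}: expected {dim} features, got {len(vec)}")
            fused.extend(vec)
        return fused

    def predict(self, features):
        """Return (subject_id, gender_class) from a single forward pass."""
        fused = self.fuse(features)
        id_logits = linear(*self.id_head, fused)
        gender_logits = linear(*self.gender_head, fused)
        return argmax(id_logits), argmax(gender_logits)


# Illustrative feature sizes only; real extractor outputs would be much larger.
model = FusionMultitaskHead({"ecg": 4, "face": 6, "fingerprint": 5})
sample = {
    "ecg": [0.2, -0.1, 0.4, 0.0],
    "face": [0.5, 0.1, -0.3, 0.2, 0.0, 0.7],
    "fingerprint": [0.1, 0.1, -0.2, 0.3, 0.6],
}
subject_id, gender = model.predict(sample)
```

Changing the dictionary passed to the constructor changes how many modalities are fused, which mirrors the configurable number of modalities mentioned in the abstract. Because the heads here are untrained, the predictions carry no meaning beyond showing the data flow.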
Highlights
Recognition is an essential function of human beings
Taken together, applying multiple biometric data in a single model has the advantage of using parameters efficiently, and it enhances security because all modalities can be processed on a single device
This paper presents a novel approach for multimodal multitask learning which is robust to noise
Summary
Recognition is an essential function of human beings. Humans recognize a person using various inputs such as voice, face, or gesture. We focus on a DL model with multiple modalities, which has many benefits including noise reduction. Results: This study proposes a novel multimodal and multitask model that can both identify a person's ID and classify gender in a single step. In tests with noisy input, the multimodal model was more robust and performed better than a single-modality model. Based on the results of this study, we suggest a new perspective on the human identification task that performs better than previous approaches. If a modality has a low signal-to-noise ratio, it can be compensated by another modality, so the overall system performance is maintained. In the training phase, correlated features among modalities can be trained, and this may lead to