Abstract

The paper presents a solution to the problem of person recognition by social robots via a novel brain-inspired multi-modal perceptual system. The system employs a spiking neural network to integrate face, body features, and voice data to recognize a person in various social human-robot interaction scenarios. We suggest that, by and large, most reported multi-biometric person recognition algorithms require active participation by the subject and as such are not appropriate for social human-robot interactions. The proposed algorithm, however, relaxes this constraint. As there are no public datasets for multimodal systems, we designed a hybrid dataset by integrating the ubiquitous FERET, RGB-D, and TIDIGITS datasets for face recognition, person recognition, and speaker recognition, respectively. The combined dataset facilitates association of facial features, body shape, and speech signature for multimodal person recognition in social settings. This multimodal dataset is employed for testing the algorithm. We assess the performance of the algorithm and discuss its merits against related methods. Within the context of social robotics, the results suggest the superiority of the proposed method over other reported person recognition algorithms.

Highlights

  • Recognizing people whom we have met before is an indispensable attribute that is often taken for granted, yet it plays a central role in our social interactions

  • Within the context of social robotics, the results suggest the superiority of the proposed method over other reported person recognition algorithms

  • We would like to emphasize that we present a person recognition algorithm incorporating multimodal biometric features that is non-intrusive, is not affected by changes in appearance, and works at rates within the range of human social interaction

Summary

Introduction

Recognizing people whom we have met before is an indispensable attribute that is often taken for granted, yet it plays a central role in our social interactions. The cane acts as a sensory substitution or extension.” Within this setting, we mostly recognize people from their faces, though other characteristics such as voice, body features, height, and similar attributes often contribute to the recognition process. Consider how we recognize members of our family, co-workers, and close friends. Their faces, voices, body shapes, and other features are holistically involved in the recognition process, and the absence of one or more of these attributes usually does not influence the outcome of the recognition. Sci. 2018, 8, 387 biometric parameters for person re-identification for applications such as surveillance, security, or forensics systems. These features/parameters are derived from physiological and/or behavioral characteristics of humans, such as fingerprint, palm-print, iris, hand vein, body, face, gait, voice, signature, and keystrokes.

Related Studies
Architecture
Front-End Sensors and Preprocessing
Feature Extraction
Vision-Based Feature Vectors
Voice-Based
12 S lowest
Dedicated
Dedicated Processing Units for Vision-Based Feature Vectors
Dedicated Processing Units for Voice-Based Feature Vector
Temporal Binding via Spiking Neural
Temporal
Results
Generation
Experiment 1
The recognition
Experiment
Experiment 4
Discussions
Conclusions
