Linear Regression-based Classifier for audio visual person identification

M R Alam, F Sohel, M Bennamoun, R Togneri, I Naseem

doi:10.1109/iccspa.2013.6487281

Abstract

This paper presents an audio-visual (AV) person identification system based on the Linear Regression-based Classifier (LRC). Class-specific models are created by stacking q-dimensional speech and image vectors from the training data. The person identification task is treated as a linear regression problem, i.e., a test (speech or image) feature vector is expressed as a linear combination of the (speech or image) model of the class it belongs to. The Euclidean distances between a test feature vector and the estimated response vectors for all the class-specific models are used as matching scores. The matching scores from both modalities are normalized using the min-max score normalization technique and then combined using the sum rule of fusion. The system was tested on 88 subjects from the AusTalk AV database. Experimental results show that the identification accuracy after AV fusion is higher than that of an individual modality.
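The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of NumPy's least-squares solver, and the synthetic data are assumptions made for illustration. Each class model is assumed to be a q x n matrix whose columns are training vectors. The abstract does not say whether distances are converted to similarities before fusion, so the sketch keeps them as distances and picks the class with the smallest fused score.

```python
import numpy as np


def lrc_distances(class_models, y):
    """Distance from y to its least-squares reconstruction by each class model.

    class_models: list of arrays of shape (q, n_i), one per class
                  (columns are stacked training vectors; assumed layout).
    y:            test feature vector of shape (q,).
    """
    distances = []
    for X in class_models:
        # Solve min_beta ||y - X beta||_2 for this class.
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        y_hat = X @ beta  # estimated response vector
        distances.append(np.linalg.norm(y - y_hat))
    return np.array(distances)


def min_max_normalize(scores):
    """Map scores linearly onto [0, 1]; constant input maps to all zeros."""
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)


def fuse_and_identify(audio_dist, visual_dist):
    """Sum-rule fusion of normalized distances; smallest fused score wins."""
    fused = min_max_normalize(audio_dist) + min_max_normalize(visual_dist)
    return int(np.argmin(fused)), fused


# Synthetic example (hypothetical data): 3 classes, 5 training vectors each.
rng = np.random.default_rng(0)
audio_models = [rng.standard_normal((20, 5)) for _ in range(3)]
visual_models = [rng.standard_normal((30, 5)) for _ in range(3)]

# Build test vectors that lie in the subspace of class 1 in both modalities.
y_audio = audio_models[1] @ rng.standard_normal(5)
y_visual = visual_models[1] @ rng.standard_normal(5)

identity, fused_scores = fuse_and_identify(
    lrc_distances(audio_models, y_audio),
    lrc_distances(visual_models, y_visual),
)
print("identified class:", identity)
```

In this sketch q must exceed the number of training vectors per class; otherwise every class model can reconstruct any test vector exactly and the distances carry no information.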

Full Text