Sparse coding based lip texture representation for visual speaker identification

Jun-Yao Lai, Shi-Lin Wang, Xing-Jian Shi, Alan Wee-Chung Liew

doi:10.1109/icdsp.2014.6900736

Abstract

Recent research has shown that a speaker's lip shape and movement carry rich identity-related information and can be used for speaker identification and authentication. Among static lip features, the lip texture (the intensity variation inside the outer lip contour) has high discriminative power for differentiating speakers. However, existing lip texture representations do not describe the texture information adequately and yield unsatisfactory identification results. In this paper, a sparse representation of the lip texture is proposed and a corresponding visual speaker identification scheme is presented. In the training stage, a sparse dictionary is built for each speaker from that speaker's texture samples. In the testing stage, the lip texture is extracted from the lip image under investigation, and the reconstruction error is computed with each speaker's dictionary. The lip image is assigned to the speaker whose dictionary gives the minimum reconstruction error. Experimental results show that, when only lip texture information is considered, the proposed sparse coding based scheme achieves much better identification accuracy (91.37% for isolated images and 98.21% for image sequences) than several state-of-the-art methods.
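The decision rule described in the abstract (one dictionary per speaker, assignment by minimum reconstruction error) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the sparse solver is a plain orthogonal matching pursuit, each dictionary is simply a matrix whose unit-norm columns stand in for a learned dictionary, the sparsity level n_nonzero is arbitrary, and summing per-frame errors for a sequence is only one plausible way to combine frames. The function names are hypothetical.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit (assumed solver, not the paper's).

    Approximates x using at most n_nonzero columns (atoms) of D.
    """
    residual = x.copy()
    support = []
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Refit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

def reconstruction_error(D, x, n_nonzero=3):
    """L2 error of the sparse reconstruction of x from dictionary D."""
    coef = omp(D, x, n_nonzero)
    return float(np.linalg.norm(x - D @ coef))

def identify(dictionaries, x, n_nonzero=3):
    """Return the speaker whose dictionary reconstructs x best, plus all errors."""
    errors = {spk: reconstruction_error(D, x, n_nonzero)
              for spk, D in dictionaries.items()}
    return min(errors, key=errors.get), errors

def identify_sequence(dictionaries, frames, n_nonzero=3):
    """Assumed sequence rule: sum per-frame errors, then take the minimum."""
    totals = {spk: sum(reconstruction_error(D, f, n_nonzero) for f in frames)
              for spk, D in dictionaries.items()}
    return min(totals, key=totals.get), totals

# Synthetic stand-in for lip texture vectors: each hypothetical speaker's
# samples lie near a speaker-specific 3-dimensional subspace of R^64.
rng = np.random.default_rng(0)
bases = {s: rng.standard_normal((64, 3)) for s in ("A", "B", "C")}

def sample(speaker, n):
    X = bases[speaker] @ rng.standard_normal((3, n))
    X += 0.01 * rng.standard_normal((64, n))
    return X / np.linalg.norm(X, axis=0)  # unit-norm columns

dictionaries = {s: sample(s, 20) for s in bases}
probe = sample("B", 1)[:, 0]
speaker, errors = identify(dictionaries, probe)
print(speaker, errors)
```

In this toy setting the probe is reconstructed almost exactly by the dictionary of the speaker it was drawn from and poorly by the others, which is the property the minimum-error rule relies on. Real lip texture vectors and learned dictionaries would replace the synthetic data.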
