Combination method air and bone conducted speech for speaker recognition in i-vector space

Satoru Tsuge, Shingo Kuroiwa

doi:10.1121/1.4969170

Abstract

In recent years, new sensors for collecting speech data, such as bone-conductive microphones, throat microphones, and non-audible murmur (NAM) microphones, have been developed alongside conventional condenser microphones. Accordingly, some researchers have begun to study speaker and speech recognition using speech collected by these sensors. Among them, we focus on the bone-conductive microphone. Speaker recognition based on i-vectors has recently become the state of the art. Hence, in this paper, we first report speaker recognition performance on bone-conducted speech using an i-vector-based speaker recognition system. In addition, we propose a speaker recognition method that combines bone-conducted speech with air-conducted speech. We investigate three combination methods: a distance combination method, an i-vector combination method, and a feature combination method. To evaluate the proposed methods, we conducted speaker identification experiments. The experimental results show that the performance of bone-conducted speech is almost the same as that of air-conducted speech when the enrolment and evaluation speech are collected in the same session. The results also show that all of the proposed methods improve on the speaker recognition performance of air- and bone-conducted speech.
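
The abstract names the three combination methods but does not give their formulas, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes cosine similarity as the i-vector score, an equal-weight sum for the distance combination, concatenation of the two i-vectors for the i-vector combination, and frame-wise concatenation of acoustic features (before i-vector extraction) for the feature combination. All function names, the `weight` parameter, and the toy vectors are hypothetical.

```python
import numpy as np


def cosine_score(a, b):
    """Cosine similarity between two i-vectors (assumed scoring function)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def distance_combination(test, enrol, weight=0.5):
    """Score each channel separately, then take a weighted sum of the scores.

    `test` and `enrol` are (air_ivector, bone_ivector) pairs.
    `weight` is a hypothetical fusion weight for the air channel.
    """
    air = cosine_score(test[0], enrol[0])
    bone = cosine_score(test[1], enrol[1])
    return weight * air + (1.0 - weight) * bone


def ivector_combination(test, enrol):
    """Concatenate the air and bone i-vectors, then score the joint vector."""
    return cosine_score(np.concatenate(test), np.concatenate(enrol))


def feature_combination(air_frames, bone_frames):
    """Concatenate frame-level features of both channels, frame by frame.

    The result would be passed to a single i-vector extractor, which is
    outside the scope of this sketch. Inputs are (frames, dims) arrays;
    the longer one is truncated so the frame counts match.
    """
    n = min(len(air_frames), len(bone_frames))
    return np.hstack([air_frames[:n], bone_frames[:n]])


def identify(score_fn, test, enrolled):
    """Return the enrolled speaker whose model gives the highest score."""
    return max(enrolled, key=lambda spk: score_fn(test, enrolled[spk]))


# Toy data: two enrolled speakers with 2-dimensional i-vectors per channel.
enrolled = {
    "spk_a": (np.array([1.0, 0.0]), np.array([1.0, 0.0])),
    "spk_b": (np.array([0.0, 1.0]), np.array([0.0, 1.0])),
}
test = (np.array([0.9, 0.1]), np.array([0.8, 0.2]))

by_distance = identify(distance_combination, test, enrolled)
by_ivector = identify(ivector_combination, test, enrolled)
joint_frames = feature_combination(np.zeros((5, 3)), np.ones((4, 3)))
```

In this reading, the three methods differ in where the two channels are fused: at the score level, at the i-vector level, or at the acoustic feature level before any i-vector is extracted.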
