Abstract

To achieve robust speaker verification, we propose a multimodal method that adds non-audio features and a glottal activity detector. An electroglottograph (EGG) serves as the non-audio sensor, and parameters of the EGG signal augment the conventional audio feature vector. The EGG parameterization algorithm is based on the shape of the idealized waveform and on the glottal activity detector. We compare our algorithm with a conventional one in terms of verification accuracy in a high-noise environment. All experiments are performed using a Gaussian Mixture Model (GMM) recognition system. The results show a significant improvement in text-independent speaker verification in a high-noise environment and indicate room for further improvement in this area.
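The idea of augmenting the audio feature vector with EGG parameters and scoring with a GMM can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the paper's implementation: the function names, the use of the glottal activity flags to select frames, the diagonal covariances, and the scoring against a background model (a common setup in GMM-based verification that the abstract does not confirm).

```python
import math


def augment_frames(audio_feats, egg_feats, glottal_active):
    """Concatenate per-frame audio and EGG features.

    Assumption: only frames flagged as glottally active are kept.
    """
    if not (len(audio_feats) == len(egg_feats) == len(glottal_active)):
        raise ValueError("all streams must have the same number of frames")
    return [a + e for a, e, g in zip(audio_feats, egg_feats, glottal_active) if g]


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one frame under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def verification_score(frames, speaker_gmm, background_gmm):
    """Average per-frame log-likelihood ratio (speaker vs. background).

    Each GMM is a hypothetical (weights, means, variances) tuple.
    """
    if not frames:
        raise ValueError("no frames to score")
    total = sum(
        gmm_log_likelihood(f, *speaker_gmm) - gmm_log_likelihood(f, *background_gmm)
        for f in frames
    )
    return total / len(frames)
```

In such a setup the claimed identity would be accepted when the score exceeds a threshold tuned on held-out data; the threshold and the model parameters would come from training, which this sketch leaves out.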

Highlights

  • Speaker Verification (SV) is the process of verifying the claimed identity of a speaker using features extracted from her/his voice.

  • Conventional SV uses the recorded audio signal as the sole source of information. It relies on features such as linear predictive cepstral coefficients (LPCC), mel-frequency cepstral coefficients (MFCC), or log area ratio (LAR) [1,2,3].

  • Considering the sensitivity of a conventional speaker verification system to noise, we examined the informativeness of EGG features.

Summary

Introduction

Speaker Verification (SV) is the process of verifying the claimed identity of a speaker using features extracted from her/his voice. When speech is corrupted by environmental noise, the distribution of the audio feature vectors is distorted. For an SV system to be of practical use in a high-noise environment, it is necessary to address the issue of robustness. To combat this problem, researchers have put forward several algorithms that assume prior knowledge of the noise, such as noise filtering techniques [8, 9], parallel model combination [10,11,12], Jacobian environmental adaptation [13, 14], microphone arrays [15, 16], or speech enhancement techniques that model the probability density functions (pdf) of speech and noise [17, 18]. Because audio and EGG signals differ in nature, each requires its own method for optimal parameterization.
