Auditory Sparse Representation for Robust Speaker Recognition Based on Tensor Structure

Qiang Wu, Liqing Zhang

doi:10.1155/2008/578612

Open Access

https://doi.org/10.1155/2008/578612

Abstract

This paper investigates the problem of speaker recognition in noisy conditions. A new approach called nonnegative tensor principal component analysis (NTPCA) with a sparse constraint is proposed for speech feature extraction. We encode speech as a general higher-order tensor in order to extract discriminative features in the spectrotemporal domain. First, speech signals are represented by cochlear features based on the frequency selectivity characteristics of the basilar membrane and inner hair cells; then, low-dimensional sparse features are extracted by NTPCA for robust speaker modeling. The useful information of each subspace in the higher-order tensor can be preserved. An alternating projection algorithm is used to obtain a stable solution. Experimental results demonstrate that our method can increase recognition accuracy, particularly in noisy environments.
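The abstract does not give the NTPCA update rules, so the following is only a rough sketch of the general idea of alternating over the modes of a nonnegative tensor under a sparsity penalty. It swaps in a different, well-known technique: a nonnegative CP (PARAFAC) factorization with multiplicative updates and an L1 penalty. It should not be read as the authors' algorithm. The function names `sparse_ntf` and `reconstruct`, the penalty weight `lam`, and the interpretation of the three tensor modes (frequency channel, time frame, utterance) are assumptions made for illustration.

```python
import numpy as np


def reconstruct(A, B, C):
    """Rebuild a 3-way tensor from its CP factors: sum_r a_r o b_r o c_r."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def sparse_ntf(X, rank, lam=1e-3, n_iter=200, seed=0, eps=1e-12):
    """Nonnegative CP factorization of a nonnegative 3-way tensor X.

    Each mode is updated in turn while the other two are held fixed
    (an alternating scheme). The L1 weight `lam` enters the denominator
    of every multiplicative update and pushes small entries toward zero.
    Returns the factors and the relative reconstruction error recorded
    before the first update and after every iteration.
    """
    rng = np.random.default_rng(seed)
    # Strictly positive initialization keeps multiplicative updates well defined.
    A = rng.random((X.shape[0], rank)) + 0.1
    B = rng.random((X.shape[1], rank)) + 0.1
    C = rng.random((X.shape[2], rank)) + 0.1

    norm_x = np.linalg.norm(X)
    errors = [np.linalg.norm(X - reconstruct(A, B, C)) / norm_x]

    for _ in range(n_iter):
        # Mode 1 (assumed: frequency channels)
        A *= np.einsum("ijk,jr,kr->ir", X, B, C) / (
            A @ ((B.T @ B) * (C.T @ C)) + lam + eps)
        # Mode 2 (assumed: time frames)
        B *= np.einsum("ijk,ir,kr->jr", X, A, C) / (
            B @ ((A.T @ A) * (C.T @ C)) + lam + eps)
        # Mode 3 (assumed: utterances)
        C *= np.einsum("ijk,ir,jr->kr", X, A, B) / (
            C @ ((A.T @ A) * (B.T @ B)) + lam + eps)
        errors.append(np.linalg.norm(X - reconstruct(A, B, C)) / norm_x)

    return (A, B, C), errors


# Synthetic stand-in for a cochlear-feature tensor (not real speech data).
rng = np.random.default_rng(1)
X_demo = reconstruct(rng.random((8, 3)), rng.random((10, 3)), rng.random((6, 3)))
(A_demo, B_demo, C_demo), err_demo = sparse_ntf(X_demo, rank=3)
print(f"relative error: {err_demo[0]:.3f} -> {err_demo[-1]:.3f}")
```

Because every update multiplies a nonnegative factor by a nonnegative ratio, the factors stay nonnegative without an explicit projection step. In a real pipeline the columns of the frequency-mode factor could serve as a learned basis onto which new cochlear features are projected, but that use is a guess at how such factors might be applied rather than something the abstract states.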

Highlights

Automatic speaker recognition has developed into an important technology for various speech-based applications
We propose a new feature extraction method for robust speaker recognition based on an auditory periphery model and tensor structure
The results show that the auditory-based nonnegative tensor cepstral coefficient (ANTCC) features perform well in the presence of four types of noise

Summary

Introduction

Automatic speaker recognition has developed into an important technology for various speech-based applications. A traditional recognition system usually comprises two processes: feature extraction and speaker modeling. Conventional speaker modeling methods such as Gaussian mixture models (GMMs) [1] achieve very high performance for speaker identification and verification tasks on high-quality data when training and testing conditions are well controlled. In many practical applications, however, such systems generally cannot achieve satisfactory performance on speech signals corrupted by adverse conditions such as environmental noise and channel distortion. The performance of a traditional GMM-based speaker recognition system degrades significantly under noisy conditions, which makes it unsuitable for most real-world problems. Capturing robust and discriminative features from acoustic data therefore becomes important. Main efforts are focused on reducing the effect of noise and distortion.

Journal: EURASIP Journal on Audio, Speech, and Music Processing
Publication Date: Jan 1, 2008
Citations: 10
License type: cc-by