Abstract

Robust speaker identification for the Internet of Things (IoT) has become an active research problem in biometrics, with wide applications in commerce and public security. To improve its robustness, we propose a novel Stacked Sparse Denoising Auto-encoder (SSDAE). First, i-vector features are estimated using a universal background model and a total variability space model. Second, sparsity constraints and input corruption are incorporated into a standard auto-encoder to construct a single sparse denoising auto-encoder. The SSDAE is then built by stacking multiple sparse denoising auto-encoder hidden layers in a layer-wise manner. A Softmax layer is added as the final layer of the deep neural network for classification. Experimental results show that the proposed method outperforms state-of-the-art approaches on the robust speaker identification task across different kinds and levels of noisy inputs.
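As a rough illustration of how a single sparse denoising auto-encoder layer combines corruption and a sparsity constraint, the sketch below computes one such layer's training loss in NumPy. The abstract does not specify the details, so everything concrete here is an assumption: masking noise as the corruption, a sigmoid encoder with a linear decoder, a KL-divergence sparsity penalty, and the hypothetical names and values `sdae_loss`, `rho`, `beta`, and `drop_prob`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def corrupt(x, drop_prob, rng):
    """Masking noise (assumed): zero each entry with probability drop_prob."""
    mask = rng.random(x.shape) >= drop_prob
    return x * mask

def kl_sparsity(rho, rho_hat, eps=1e-8):
    """KL divergence between target activation rho and mean activations rho_hat."""
    rho_hat = np.clip(rho_hat, eps, 1.0 - eps)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def init_params(n_in, n_hidden, rng):
    """Small random weights and zero biases for one encoder/decoder pair."""
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
    return W1, np.zeros(n_hidden), W2, np.zeros(n_in)

def sdae_loss(x, params, rho=0.05, beta=3.0, drop_prob=0.3, rng=None):
    """Loss of one sparse denoising auto-encoder layer (illustrative only).

    x: batch of input features, shape (batch, n_in), e.g. i-vectors.
    Returns (loss, hidden activations).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    W1, b1, W2, b2 = params
    x_tilde = corrupt(x, drop_prob, rng)      # corrupt the input
    h = sigmoid(x_tilde @ W1 + b1)            # encode the corrupted input
    x_hat = h @ W2 + b2                       # decode
    # Reconstruct the CLEAN input, which is what makes it "denoising".
    recon = 0.5 * np.mean(np.sum((x_hat - x) ** 2, axis=1))
    sparsity = kl_sparsity(rho, h.mean(axis=0))
    return recon + beta * sparsity, h
```

Under these assumptions, stacking would amount to training one layer, then feeding its hidden activations `h` as the input `x` of the next layer, with a Softmax classifier placed on top of the last hidden layer.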
