Abstract

This paper proposes an approach to accent adaptation that uses accent-dependent bottleneck (BN) features to improve the performance of a multi-accent Mandarin speech recognition system. The adaptation architecture uses two neural networks. First, a deep neural network (DNN) acoustic model acts as a feature extractor that produces accent-dependent BN features; the input to this BN-DNN model is MFCC features appended with i-vector features. Second, a bidirectional long short-term memory (BLSTM) recurrent neural network (RNN) acoustic model performs accent-specific adaptation; its input is the accent-dependent BN features appended with MFCC features. Experiments on the RASC863 and CASIA regional accent speech corpora show that the proposed method yields a clear improvement over the BLSTM RNN baseline model.
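
The two feature-appending steps in the abstract can be illustrated with a minimal sketch. It only shows how the per-frame input vectors might be assembled; it is not the authors' implementation. The function names, the toy dimensions, the order of concatenation, and the `toy_bottleneck` stand-in (a fixed projection, not a trained BN-DNN) are all assumptions made for illustration.

```python
from typing import List

Frame = List[float]


def tile_and_append(frames: List[Frame], utterance_vector: Frame) -> List[Frame]:
    """Append one utterance-level vector (e.g. an i-vector) to every frame."""
    return [frame + utterance_vector for frame in frames]


def append_framewise(first: List[Frame], second: List[Frame]) -> List[Frame]:
    """Concatenate two frame-aligned feature streams frame by frame."""
    if len(first) != len(second):
        raise ValueError("feature streams must have the same number of frames")
    return [a + b for a, b in zip(first, second)]


def toy_bottleneck(frames: List[Frame], bn_dim: int) -> List[Frame]:
    """Hypothetical stand-in for the BN-DNN extractor.

    A real system would read the activations of a trained network's
    bottleneck layer. Here each output value is just the sum of every
    bn_dim-th input value, so the shapes can be followed end to end.
    """
    return [[sum(frame[i::bn_dim]) for i in range(bn_dim)] for frame in frames]


def build_blstm_input(mfcc: List[Frame], ivector: Frame, bn_dim: int) -> List[Frame]:
    """Assemble the second-stage input: BN features appended with MFCCs."""
    bn_dnn_input = tile_and_append(mfcc, ivector)       # MFCC + i-vector
    bn_features = toy_bottleneck(bn_dnn_input, bn_dim)  # accent-dependent BN (toy)
    return append_framewise(bn_features, mfcc)          # BN + MFCC (order assumed)
```

For two 3-dimensional MFCC frames, a 2-dimensional i-vector, and `bn_dim=2`, `build_blstm_input` returns two 5-dimensional frames: 2 BN values followed by the 3 original MFCC values.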
