Abstract

This study proposes a polynomial-based feature transferring (PFT) algorithm for acoustic feature conversion. The PFT process consists of an estimation phase and a conversion phase. The estimation phase computes a polynomial-based transfer function using only a small set of parallel source and target features. The conversion phase then applies the estimated transfer function to convert large sets of source features into target features. We evaluate the proposed PFT algorithm on a robust automatic speech recognition (ASR) task using the Aurora-2 database. The source features were MFCCs with cepstral mean and variance normalization (CMVN), and the target features were advanced front-end (AFE) features. Compared to CMVN, AFE provides more robust speech recognition performance, but its feature extraction is more complex and computationally expensive. With PFT, we intend to use a simple transfer function to obtain AFE-like acoustic features from the source CMVN features. Experimental results on Aurora-2 demonstrate that PFT generates AFE-like features that notably improve on CMVN performance and approach the results achieved by AFE. Furthermore, the recognition accuracy of PFT was better than that of histogram equalization (HEQ) and polynomial-based histogram equalization (PHEQ). The results confirm that PFT is effective with just a few sets of parallel features.
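The two phases described above can be illustrated with a minimal sketch. The abstract does not state the polynomial form, its order, or the estimation criterion, so the following assumes an independent polynomial per feature dimension fitted by ordinary least squares. The function names `estimate_transfer` and `convert`, the `order` parameter, and the synthetic data are hypothetical and chosen only for illustration; they are not the authors' implementation.

```python
import numpy as np


def estimate_transfer(source, target, order=3):
    """Estimation phase (assumed form): fit one polynomial per dimension.

    source, target: frame-aligned (parallel) arrays of shape (frames, dims).
    Returns an array of shape (dims, order + 1), highest power first.
    """
    dims = source.shape[1]
    # Least-squares fit of target[:, d] as a polynomial in source[:, d].
    return np.stack(
        [np.polyfit(source[:, d], target[:, d], order) for d in range(dims)]
    )


def convert(source, coeffs):
    """Conversion phase: apply the per-dimension polynomials to new frames."""
    dims = source.shape[1]
    return np.stack(
        [np.polyval(coeffs[d], source[:, d]) for d in range(dims)], axis=1
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)

    # Small parallel set: synthetic stand-ins for source and target features.
    small_src = rng.normal(size=(200, 13))
    small_tgt = 0.5 * small_src**2 - small_src + 0.2

    coeffs = estimate_transfer(small_src, small_tgt, order=2)

    # Large set of source features converted with the estimated function.
    large_src = rng.normal(size=(5000, 13))
    converted = convert(large_src, coeffs)
    print(converted.shape)
```

In this sketch the transfer function is estimated once from 200 parallel frames and then reused on a much larger set, which mirrors the small-estimation-set, large-conversion-set split the abstract describes. A real system would use actual CMVN and AFE feature frames in place of the synthetic arrays.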
