Abstract

In this paper, we propose a process that integrates feature compensation and feature selection over a collection of acoustic feature sets to derive an advanced acoustic feature set for speaker state recognition. For feature compensation, we apply two-dimensional histogram equalization (2-D HEQ) normalization to reduce the variability caused by speaker and speaking-environment factors. For feature selection, we apply principal component analysis (PCA)-based selection to extract meaningful parameters from the original acoustic feature sets and to eliminate redundant components. We conducted experiments on the Alcohol Language Corpus (ALC) and the Sleepy Language Corpus (SLC) provided in the INTERSPEECH 2011 Speaker State Challenge. The openSMILE toolkit is used to extract acoustic features consisting of low-level descriptors and their related functionals. Experimental results show that the derived acoustic feature set, processed by 2-D HEQ normalization and PCA-based selection, improves on the original feature sets. The results indicate that the derived acoustic feature set is a discriminative and compact representation that efficiently exploits multiple knowledge sources from the ensemble of acoustic feature sets.
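The two-stage pipeline described above (normalize, then reduce) can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper. It uses ordinary per-dimension (1-D) histogram equalization toward a standard normal reference as a simplified stand-in for 2-D HEQ. It uses a cumulative-variance threshold (`var_kept`, a hypothetical parameter) to choose the number of principal components. It replaces openSMILE features with synthetic data. The function names `heq_normalize` and `pca_reduce` are illustrative only.

```python
import numpy as np
from statistics import NormalDist


def heq_normalize(X):
    """Map each feature column onto a standard normal reference distribution.

    Plain per-dimension (1-D) histogram equalization via rank statistics,
    a simplified stand-in for the 2-D HEQ named in the abstract.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Rank of each value within its column (0 .. n-1).
    ranks = X.argsort(axis=0).argsort(axis=0)
    # Empirical CDF values, kept strictly inside (0, 1).
    cdf = (ranks + 0.5) / n
    # Inverse CDF of the reference distribution, applied elementwise.
    inv = np.vectorize(NormalDist().inv_cdf)
    return inv(cdf)


def pca_reduce(X, var_kept=0.95):
    """Project X onto its leading principal components.

    Keeps the smallest number of components whose cumulative explained
    variance reaches var_kept (an assumed, illustrative threshold).
    """
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    k = int(np.searchsorted(np.cumsum(explained), var_kept)) + 1
    k = min(k, len(s))
    return Xc @ Vt[:k].T, k


# Synthetic stand-in for utterance-level features: 200 utterances x 6 features,
# with skewed distributions and partly redundant (correlated) columns.
rng = np.random.default_rng(0)
base = rng.exponential(size=(200, 3))
X = np.hstack([base, base @ rng.normal(size=(3, 3)) + 5.0])

Z = heq_normalize(X)   # compensation step
Y, k = pca_reduce(Z)   # selection / reduction step
print(Z.shape, Y.shape, k)
```

After equalization every column follows the same reference distribution, so differences in scale, offset, and skew between features are removed before PCA discards the correlated directions. In a real setup the equalization statistics and the PCA projection would presumably be estimated on training data and then applied unchanged to test data.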
