Abstract
This chapter introduces a neural-network-based approach to identifying human affective state from speech signals. A set of candidate features is first identified and extracted to represent the characteristics of different emotions. To reduce the dimensionality of the feature space while increasing the discriminatory power of the features, a systematic feature selection approach is presented that combines sequential forward selection (SFS), using a general regression neural network (GRNN), with a consistency-based selection method. The selected parameters serve as inputs to a modular neural network made up of sub-networks, each of which specializes in a particular emotion class. Compared with a standard neural network, this modular architecture decomposes a complex classification problem into small subtasks, so that the network can be tuned to the characteristics of each individual emotion. The performance of the proposed system is evaluated for various subjects speaking different languages. The results show that the system achieves satisfactory emotion detection performance and demonstrates a significant increase in versatility through its propensity for language independence.
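
As a rough illustration of the SFS-with-GRNN idea mentioned above, the following minimal sketch greedily adds one feature at a time, scoring each candidate subset with a GRNN (a Gaussian-kernel-weighted average of training targets). It is not the chapter's implementation: the leave-one-out squared-error criterion, the smoothing parameter `sigma`, the stopping rule `n_select`, and all function names are assumptions made for illustration, and the consistency-based selection step is not covered.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """GRNN output: Gaussian-kernel-weighted average of the training targets."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, query)) / (2 * sigma ** 2))
        for x in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed; fall back to the plain mean of the targets.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_error(data, targets, features, sigma=0.5):
    """Leave-one-out mean squared error using only the given feature columns.

    Assumed selection criterion; the chapter may score subsets differently.
    """
    projected = [[row[f] for f in features] for row in data]
    error = 0.0
    for i, query in enumerate(projected):
        train_x = projected[:i] + projected[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        error += (grnn_predict(train_x, train_y, query, sigma) - targets[i]) ** 2
    return error / len(projected)


def sequential_forward_selection(data, targets, n_select, sigma=0.5):
    """Greedily add the feature that most reduces the GRNN leave-one-out error."""
    selected = []
    remaining = list(range(len(data[0])))
    while remaining and len(selected) < n_select:
        best = min(
            remaining,
            key=lambda f: loo_error(data, targets, selected + [f], sigma),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): column 0 separates the two classes, column 1 does not.
toy_data = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [1.0, 0.8], [0.9, 0.2], [1.1, 0.5]]
toy_targets = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
first_feature = sequential_forward_selection(toy_data, toy_targets, n_select=1)
```

On the toy data, the informative column is selected first because it yields the lower leave-one-out error. In practice the features would be acoustic parameters extracted from speech, and `sigma` would need tuning.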