Abstract

In speech emotion recognition (SER), speech data are usually captured under different scenarios, which often leads to significant performance degradation due to the inherent mismatch between the training and test sets. To cope with this problem, we propose a domain adaptation method called Sharing Priors between Related Source and Target classes (SPRST), based on a two-layer neural network. Common priors are imposed on the classifier parameters, namely the weights of the second layer, that are shared between related classes, so that classes with few labeled instances in the target domain can borrow knowledge from related classes in the source domain. The method is evaluated on the INTERSPEECH 2009 Emotion Challenge two-class task. Experimental results show that our approach significantly improves performance when only a small number of labeled target instances are available.
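The abstract does not spell out the exact form of the shared priors, but the core idea of tying the second-layer (classifier) weights of related source and target classes can be sketched as a regularization term. The following is a minimal illustration under assumed details: a Gaussian-style penalty pulling each related source/target weight pair toward a common prior mean (the function name, pairing scheme, and penalty strength `lam` are all hypothetical, not from the paper).

```python
import numpy as np

def shared_prior_penalty(W_src, W_tgt, related_pairs, lam=0.1):
    """Hypothetical sketch of a shared-prior regularizer.

    W_src, W_tgt: second-layer weight matrices (hidden_dim x n_classes)
    for the source- and target-domain classifiers.
    related_pairs: list of (source_class, target_class) index pairs
    considered related across domains.

    Each related pair is pulled toward a common prior mean, so target
    classes with little labeled data inherit structure from related
    source classes. The paper's actual prior may differ.
    """
    total = 0.0
    for s, t in related_pairs:
        # Common prior mean shared by the related pair (an assumption).
        mu = 0.5 * (W_src[:, s] + W_tgt[:, t])
        total += np.sum((W_src[:, s] - mu) ** 2)
        total += np.sum((W_tgt[:, t] - mu) ** 2)
    return lam * total
```

In training, this penalty would be added to the classification loss of the two-layer network; minimizing it keeps related class weights close, which is the "borrowing" effect the abstract describes.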
