Abstract

The precision of speech emotion estimation has increased with the development of deep learning. However, most deep learning approaches to emotion estimation rely on supervised learning, and large labeled data sets are difficult to obtain. In addition, when the environment of the training data differs significantly from that of the actual data, the accuracy of emotion estimation deteriorates greatly. To solve these problems, in this study we build a smooth emotion estimation model using virtual adversarial training (VAT), a semi-supervised learning method that improves the robustness of the model. VAT has attracted attention in machine learning as a method of smoothing a model by adding small, intentional perturbations to the training data during learning. We first set the VAT hyperparameters through verification on a single corpus, and then perform cross-corpus evaluation experiments to show the improvement in model robustness.
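
The perturbation step that VAT adds to training can be sketched as follows. This is a minimal illustration under stated assumptions, not the model used in this study: a linear softmax classifier stands in for the emotion estimator so that the gradient can be written by hand, and the names `vat_loss`, `eps` (perturbation norm), `xi` (probe step size), and `n_power` (number of power iterations) are hypothetical. The loss uses no labels, which is why it can be computed on unlabeled speech as well.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q, tiny=1e-12):
    # KL(p || q) for each row.
    return np.sum(p * (np.log(p + tiny) - np.log(q + tiny)), axis=-1)

def unit(v, tiny=1e-12):
    # Scale each row to (approximately) unit L2 norm.
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + tiny)

def vat_loss(W, b, x, eps=1.0, xi=1e-6, n_power=1, rng=None):
    """Label-free VAT loss for a linear softmax model (illustrative stand-in).

    W: (classes, dim) weights, b: (classes,) bias, x: (n, dim) inputs.
    Returns the mean KL between predictions on x and on x + r_adv, and r_adv.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    p = softmax(x @ W.T + b)                 # current prediction, held fixed
    d = unit(rng.standard_normal(x.shape))   # random starting direction
    for _ in range(n_power):
        q = softmax((x + xi * d) @ W.T + b)
        # For a linear model, the gradient of KL(p || q) with respect to
        # the perturbation is (q - p) W.
        d = unit((q - p) @ W)
    r_adv = eps * d                          # small adversarial perturbation
    q_adv = softmax((x + r_adv) @ W.T + b)
    return kl(p, q_adv).mean(), r_adv
```

In a semi-supervised setup of this kind, the value returned by `vat_loss` would typically be added, with a weighting coefficient, to the supervised cross-entropy loss computed on the labeled portion of the data.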
