Consistency Regularization을 적용한 멀티모달 한국어 감정인식 (Multi-modal Korean Emotion Recognition with Consistency Regularization)

Jounghee Kim, Pilsung Kang

doi:10.7232/jkiie.2021.47.6.549

Abstract

Recently, demand has been growing for artificial intelligence-based voice services that identify and appropriately respond to user needs from voice input. In particular, technology for recognizing emotion, which is non-verbal information carried by the human voice, is receiving significant attention as a way to improve the quality of voice services. Accordingly, deep learning-based speech emotion recognition models are being actively studied using abundant English data, and a multi-modal emotion recognition framework with a speech recognition module has been proposed to utilize both voice and text information. However, a framework that relies on a speech recognition module has a disadvantage in real environments where ambient noise exists: its performance decreases as the speech recognition rate decreases. In addition, it is challenging to apply deep learning-based models to Korean emotion recognition because, unlike for English, emotion data is not abundant. To address this drawback of the framework, we propose a consistency regularization learning methodology that lets the model account for the difference between the actual content of the speech and the text extracted by the speech recognition module. Considering the limited Korean emotion data, we also adapt self-supervised pre-trained models, such as Wav2vec 2.0 and HanBERT, to the framework. Our experimental results on a Korean multi-modal emotion dataset show that the framework with pre-trained models performs better than a model trained on speech only. The proposed learning methodology can also minimize the performance degradation caused by poorly performing speech recognition modules.
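
The abstract does not spell out the loss used for consistency regularization, so the following is only a minimal sketch of one common generic form, not the authors' formulation. It assumes the model produces emotion logits twice for the same utterance, once with the reference transcript and once with the speech recognition output, and adds a KL-divergence penalty between the two predictions to the usual cross-entropy term. The function names, the `weight` parameter, and the choice of KL divergence are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true emotion class."""
    return -math.log(probs[label])


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_ref, logits_asr, label, weight=1.0):
    """Hypothetical combined loss (an assumption, not taken from the paper).

    logits_ref: emotion logits computed with the reference transcript
    logits_asr: emotion logits computed with the speech recognition output
    label:      index of the true emotion class
    weight:     strength of the consistency term
    """
    p_ref = softmax(logits_ref)
    p_asr = softmax(logits_asr)
    supervised = cross_entropy(p_ref, label)
    consistency = kl_divergence(p_ref, p_asr)
    return supervised + weight * consistency


# Identical predictions incur no consistency penalty;
# diverging predictions increase the loss.
same = consistency_loss([2.0, 0.5, 0.1], [2.0, 0.5, 0.1], label=0)
diff = consistency_loss([2.0, 0.5, 0.1], [0.1, 0.5, 2.0], label=0)
```

Under these assumptions, the penalty pushes the model toward the same emotion prediction whether or not the transcript contains recognition errors, which is the intuition behind making the framework robust to a weak speech recognition module.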


More From: Journal of the Korean Institute of Industrial Engineers


