A Deep Learning Approach for Stress Detection Through Speech with Audio Feature Analysis

Andani Achmad, Intan Sari Areni, Ingrid Nurtanio, Phie Chyan

doi:10.1109/icitisee57756.2022.10057845

Abstract

Stress, a shift in psychological state from calm to emotionally aroused, is a psychological problem that can negatively affect a person's physical and mental condition. The pressures of daily life can act as stressors that trigger it. Artificial intelligence-based approaches are currently used to detect stress through various indicators, one of which is speech. In this study, a deep learning model based on a CNN architecture was developed to detect stress from voice recordings using sound features extracted in the signal domain. The model was evaluated on two open-source datasets (CREMA-D and TESS), and the best accuracy obtained was 97.1% for binary classification of speech labelled as stressed or unstressed. Among the feature combinations tested, the highest accuracy was obtained with a combination of Mel spectrogram and MFCC features. These results show that a deep learning model with appropriate sound feature extraction can accurately detect stress from voice recordings.
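As a rough illustration of what combining Mel spectrogram and MFCC features can look like, the following NumPy sketch computes a log-Mel spectrogram, derives MFCCs from it with a DCT-II, and stacks the two along the feature axis. The abstract does not state the frame length, hop size, number of Mel bands, number of coefficients, or how the features were combined, so every parameter value, every function name, and the stacking scheme here are assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-Mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the Mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    # Assumed settings: 512-sample Hann frames, 50% overlap, 40 Mel bands.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T
    return np.log(mel + 1e-10).T  # shape: (n_mels, n_frames)

def mfcc_from_log_mel(log_mel, n_mfcc=13):
    # Unnormalized DCT-II over the Mel axis; keep the first n_mfcc coefficients.
    n_mels = log_mel.shape[0]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return basis @ log_mel  # shape: (n_mfcc, n_frames)

def combined_features(signal, sr):
    # Assumed combination scheme: stack both features along the feature axis.
    log_mel = log_mel_spectrogram(signal, sr)
    mfcc = mfcc_from_log_mel(log_mel)
    return np.concatenate([log_mel, mfcc], axis=0)

# Synthetic one-second 440 Hz tone standing in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
features = combined_features(tone, sr)
print(features.shape)
```

The resulting two-dimensional array (feature bins by time frames) is the kind of input a 2-D CNN classifier could consume; in practice a library such as librosa would typically replace the hand-written feature extraction.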
