Abstract

This paper proposes an audio depression recognition method based on a convolutional neural network (CNN) and a generative adversarial network (GAN; the paper's "generative antagonism network"). First, the dataset is preprocessed: long silent segments are removed and the remaining audio is spliced into a new audio file. Next, speech features such as Mel-frequency cepstral coefficients (MFCCs), short-term energy, and spectral entropy are extracted using an audio difference normalization algorithm. The extracted feature matrices, which represent attributes unique to each subject's voice, form the data basis for model training. A depression recognition model, DR AudioNet, is then built by combining the CNN and the GAN. With DR AudioNet, the former model is optimized and the recognition and classification are completed using the normalized features of the two segments adjacent to the current audio segment, one before it and one after it. Experimental results on the AViD-Corpus and DAIC-WOZ datasets show that the proposed method reduces depression recognition error relative to existing methods: the RMSE and MAE values obtained on both datasets improve on the comparison algorithms by more than 5%.
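The preprocessing and two of the features named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the frame length, the energy threshold, the `max_silent_frames` limit, and all function names are assumptions chosen for illustration, the audio difference normalization step is not reproduced, and MFCC extraction is omitted. Spectral entropy is computed here as the Shannon entropy of the normalized one-sided power spectrum, which is one common definition and may differ from the one the paper uses.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into frames; a tail shorter than frame_len is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Sum of squared amplitudes within one frame."""
    return sum(x * x for x in frame)


def remove_long_silence(samples, frame_len, energy_thresh, max_silent_frames):
    """Drop runs of more than max_silent_frames low-energy frames and
    splice the remaining frames together (all thresholds are assumed)."""
    frames = frame_signal(samples, frame_len, frame_len)  # non-overlapping
    kept, run = [], []
    for frame in frames:
        if short_term_energy(frame) < energy_thresh:
            run.append(frame)          # still inside a silent run
            continue
        if len(run) <= max_silent_frames:
            kept.extend(run)           # short pause: keep it
        run = []
        kept.append(frame)
    if len(run) <= max_silent_frames:  # handle a trailing silent run
        kept.extend(run)
    return [x for frame in kept for x in frame]


def spectral_entropy(frame):
    """Shannon entropy (bits) of the normalized one-sided power spectrum,
    using a naive DFT so the sketch needs no third-party library."""
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0
    return -sum((p / total) * math.log2(p / total) for p in power if p > 0)


# Toy signal: voiced, long silence (5 frames), voiced, short pause, voiced.
voiced = [math.sin(2 * math.pi * t / 8) for t in range(32)]
signal = voiced + [0.0] * 80 + voiced + [0.0] * 16 + voiced
cleaned = remove_long_silence(signal, frame_len=16,
                              energy_thresh=1e-3, max_silent_frames=2)
```

In this toy signal the 80-sample silence is removed while the 16-sample pause is kept, so the spliced audio keeps short natural pauses. A real pipeline would use windowed, overlapping frames and an FFT rather than the naive DFT shown here.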

Highlights

  • With the improvement of people’s material life, mental health issues have received widespread attention

  • We propose a novel deep learning algorithm that combines a convolutional neural network (CNN) and a generative adversarial network (GAN) for Automatic Speech Depression Detection (ASDD)

  • This paper focuses on these three problems and proposes an audio depression recognition method based on a convolutional neural network and generative adversarial network model

Summary

INTRODUCTION

With the improvement of people’s material life, mental health issues have received widespread attention. Clinical observations and studies have found a significant correlation between audio characteristics and the degree of depression [4], [5]. Audio-based depression recognition, the subject of this paper (Z. Wang et al.: Recognition of Audio Depression Based on CNN and Generative Antagonism Network Model), has been a focus of scholars because it is low-cost, easy to collect, and non-contact [9], [10]. Compared with traditional machine learning methods, deep learning models can extract high-level semantic features through the neural network framework, which has brought breakthrough progress in recent years. We propose a novel deep learning algorithm that combines a convolutional neural network (CNN) and a generative adversarial network (GAN) for automatic speech depression detection (ASDD). This paper focuses on these three problems and proposes an audio depression recognition method based on a convolutional neural network and generative adversarial network model.

RELATED WORK
DATA PREPROCESSING
Calculation of the spectrogram entropy
AUDIO DEPRESSION REGRESSION PREDICTION NETWORK
EXPERIMENTAL RESULTS AND ANALYSIS
EVALUATING INDICATOR
MODEL TRAINING
CONCLUSION
