Abstract
This paper proposes an audio depression recognition method based on a convolutional neural network (CNN) and a generative adversarial network (GAN) model. First, the dataset is preprocessed: long silent segments are removed and the remaining audio is spliced into a new audio file. Next, speech features such as Mel-scale Frequency Cepstral Coefficients (MFCCs), short-term energy and spectral entropy are extracted using an audio difference normalization algorithm. The extracted feature matrices, which represent the unique attributes of each subject's own voice, form the basis for model training. Then DR AudioNet, which combines a CNN and a GAN, is used to build the depression recognition model. With DR AudioNet, the earlier model is optimized, and classification is completed using the normalized features of the two adjacent segments immediately before and after the current audio segment. Experimental results on the AViD-Corpus and DAIC-WOZ datasets show that the proposed method effectively reduces depression recognition error compared with existing methods, and that the RMSE and MAE values obtained on the two datasets improve on those of the comparison algorithms by more than 5%.
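The preprocessing, two of the named features, and the two error metrics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the energy threshold, the non-overlapping framing, and the naive DFT are all assumptions chosen for readability, and neither the MFCC extraction nor the paper's audio difference normalization algorithm is reproduced here.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_term_energy(frame):
    """Sum of squared amplitudes within one frame."""
    return sum(x * x for x in frame)


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the frame's normalized power spectrum.

    Uses a naive O(n^2) DFT for self-containment (assumption: a real
    pipeline would use an FFT routine instead).
    """
    n = len(frame)
    power = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        power.append(abs(coeff) ** 2)
    total = sum(power)
    if total == 0:
        return 0.0
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def remove_long_silence(samples, frame_len, energy_threshold, min_silent_frames):
    """Drop runs of at least `min_silent_frames` consecutive low-energy
    frames and splice the remaining frames together.

    Hypothetical parameters: the paper's actual silence criterion is not
    given in the abstract.
    """
    frames = frame_signal(samples, frame_len, frame_len)  # non-overlapping
    kept, silent_run = [], []
    for frame in frames:
        if short_term_energy(frame) < energy_threshold:
            silent_run.append(frame)
            continue
        if len(silent_run) < min_silent_frames:
            for quiet in silent_run:  # short pauses are kept
                kept.extend(quiet)
        silent_run = []
        kept.extend(frame)
    if len(silent_run) < min_silent_frames:
        for quiet in silent_run:
            kept.extend(quiet)
    return kept


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)
```

In this sketch a silent run is removed only when it spans at least `min_silent_frames` frames, so short natural pauses in speech survive while long silences are cut before feature extraction.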
Highlights
With the improvement of people's material living standards, mental health issues have received widespread attention
We propose a novel deep learning algorithm that combines a convolutional neural network (CNN) and a generative adversarial network (GAN) for Automatic Speech Depression Detection (ASDD)
This paper focuses on these three problems and proposes an audio depression recognition method based on a convolutional neural network and a generative adversarial network model
Summary
Z. Wang et al.: Recognition of Audio Depression Based on CNN and Generative Antagonism Network Model
With the improvement of people's material living standards, mental health issues have received widespread attention. Clinical observations and studies have found a significant correlation between audio characteristics and the degree of depression [4], [5]. Automatic Speech Depression Detection (ASDD) has been a focus of scholars because it is low-cost, easy to collect and non-contact [9], [10]. Compared with traditional machine learning methods, deep learning models can extract high-level semantic features within a neural network framework, which has brought breakthrough progress in recent years. We propose a novel deep learning algorithm that combines a convolutional neural network (CNN) and a generative adversarial network (GAN) for ASDD. This paper focuses on these three problems and proposes an audio depression recognition method based on a convolutional neural network and a generative adversarial network model.