Abstract

College students suffer from depression due to factors such as education and graduation, and its prevalence is rising, yet research in this area remains limited. We study depressive tendencies among Chinese students and use machine learning methods to detect them. This paper presents a multi-head attention deep learning network for Speech Depression Recognition (SDR) that takes Mel-frequency cepstral coefficient (MFCC) features as input. The multi-head attention, together with an embedding produced by a convolutional neural network and a bidirectional long short-term memory network (CNN-BLSTM), jointly attends to information from different representations of the same MFCC input sequence. The CNN-BLSTM embedding helps the model attend to the dominant depression features by identifying their positions in the sequence. In addition to multi-head attention and the CNN-BLSTM embedding, we apply multi-task learning with gender recognition as an auxiliary task. The auxiliary task helps the model learn gender-specific features that influence depression characteristics in speech, which improves accuracy on the primary task, SDR. We conducted all experiments on a depression dataset. We achieve an overall F1 score of 91% and an average class accuracy of 92% on SDR across the depression classes.
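
The multi-task setup described above can be illustrated with a minimal sketch of a joint training objective. The abstract does not state how the primary (depression) and auxiliary (gender) losses are combined, so the weighted sum below is an assumption, a common choice rather than the authors' confirmed method. The function names and the `aux_weight` parameter are hypothetical and chosen only for illustration.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[target])

def multitask_loss(dep_logits, dep_label, gender_logits, gender_label, aux_weight=0.5):
    """Primary SDR loss plus a weighted auxiliary gender loss.

    aux_weight is a hypothetical hyperparameter; the abstract does not
    specify how the two task losses are balanced.
    """
    primary = cross_entropy(dep_logits, dep_label)
    auxiliary = cross_entropy(gender_logits, gender_label)
    return primary + aux_weight * auxiliary

# One utterance: two depression classes, two gender classes (illustrative values).
loss = multitask_loss([2.0, 0.0], 0, [0.0, 0.0], 1)
```

In a setup like this, both output heads would share the same speech embedding, so gradients from the gender loss also shape the shared features. Setting `aux_weight` to 0 reduces the objective to single-task SDR training.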
