Attention guided learnable time-domain filterbanks for speech depression detection

Wenju Yang, Jian K Liu, Jiankang Liu, Peng Cao, Rongxin Zhu, Yang Wang, Fei Wang, Xizhe Zhang

doi:10.1016/j.neunet.2023.05.041

Abstract

Depression is a global mental health problem that lacks effective screening methods for early detection and treatment. This paper aims to facilitate large-scale screening of depression by focusing on the speech depression detection (SDD) task. Currently, direct modeling on the raw signal yields a large number of parameters, and existing deep learning-based SDD models mainly use fixed Mel-scale spectral features as input. However, these features are not designed for depression detection, and their manual settings limit the exploration of fine-grained feature representations. In this paper, we learn effective representations of the raw signals from an interpretable perspective. Specifically, we present a joint learning framework with attention-guided learnable time-domain filterbanks for depression classification (DALF), which combines the depression filterbanks features learning (DFBL) module with the multi-scale spectral attention learning (MSSA) module. DFBL produces biologically meaningful acoustic features by employing learnable time-domain filters, and MSSA guides the learnable filters to better retain the useful frequency sub-bands. We collect a new dataset, the Neutral Reading-based Audio Corpus (NRAC), to facilitate research in depression analysis, and we evaluate the performance of DALF on NRAC and the public DAIC-WOZ dataset. The experimental results demonstrate that our method outperforms the state-of-the-art SDD methods with an F1 of 78.4% on the DAIC-WOZ dataset. In particular, DALF achieves F1 scores of 87.3% and 81.7% on two parts of the NRAC dataset. By analyzing the filter coefficients, we find that the most important frequency range identified by our method is 600–700 Hz, which corresponds to the Mandarin vowels /e/ and /ê/ and can be considered an effective biomarker for the SDD task. Taken together, our DALF model provides a promising approach to depression detection.
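To make the idea of a time-domain filterbank concrete, the following is a minimal NumPy sketch, not the authors' DFBL implementation. It assumes a Gabor parameterisation (a Gaussian envelope times a complex sinusoid) with Mel-spaced initial center frequencies; the abstract does not specify either choice. All function names, the fixed 100 Hz bandwidth, the 401-tap filter length, and the 160-sample hop are hypothetical values chosen for illustration. In a learnable filterbank the center frequencies and bandwidths would be updated by gradient descent; here they stay at their initial values.

```python
import numpy as np


def hz_to_mel(f):
    # Standard HTK-style Mel conversion.
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def gabor_filterbank(center_hz, bandwidth_hz, sample_rate, length):
    """Build complex Gabor filters of shape (n_filters, length).

    center_hz and bandwidth_hz are the parameters a learnable
    filterbank would train; this sketch only evaluates them.
    """
    t = (np.arange(length) - length // 2) / sample_rate  # seconds, centered on 0
    center = np.asarray(center_hz, dtype=float)[:, None]
    bandwidth = np.asarray(bandwidth_hz, dtype=float)[:, None]
    # A Gaussian with std sigma_t in time has std 1 / (2*pi*sigma_t) in frequency.
    sigma_t = 1.0 / (2.0 * np.pi * bandwidth)
    envelope = np.exp(-0.5 * (t[None, :] / sigma_t) ** 2)
    carrier = np.exp(2j * np.pi * center * t[None, :])
    # Normalise so each filter has unit gain at its own center frequency.
    return envelope * carrier / envelope.sum(axis=1, keepdims=True)


def filterbank_features(signal, filters, hop):
    """Filter, take squared modulus, average-pool per frame, log-compress."""
    rows = []
    for h in filters:
        power = np.abs(np.convolve(signal, h, mode="same")) ** 2
        n_frames = len(power) // hop
        rows.append(power[: n_frames * hop].reshape(n_frames, hop).mean(axis=1))
    return np.log(np.asarray(rows) + 1e-6)


# Usage: a 650 Hz tone should excite the filter whose center is closest to 650 Hz.
sample_rate = 16000
time = np.arange(sample_rate // 2) / sample_rate
tone = np.cos(2.0 * np.pi * 650.0 * time)
centers = mel_to_hz(np.linspace(hz_to_mel(100.0), hz_to_mel(7000.0), 40))
bank = gabor_filterbank(centers, np.full(40, 100.0), sample_rate, 401)
features = filterbank_features(tone, bank, hop=160)
strongest = int(features.mean(axis=1).argmax())
print(f"strongest filter center: {centers[strongest]:.1f} Hz")
```

Inspecting which filters carry the most energy, as in the last lines, is a rough analogue of the filter-coefficient analysis the abstract describes for locating the 600–700 Hz band, although the paper's own analysis procedure is not given here.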


Journal: Neural Networks: the official journal of the International Neural Network Society
Publication Date: May 26, 2023
Citations: 5
