Non-uniform Speaker Disentanglement For Depression Detection From Raw Speech Signals.

Jinhan Wang, Vijay Ravi, Abeer Alwan

doi:10.21437/interspeech.2023-2101

Abstract

While speech-based depression detection methods that use speaker-identity features, such as speaker embeddings, are popular, they often compromise patient privacy. To address this issue, we propose a speaker disentanglement method that applies adversarial speaker-identification (SID) loss maximization non-uniformly: the adversarial weight is varied across the layers of a model during training. We find that a greater adversarial weight for the initial layers leads to performance improvement. Our approach using the ECAPA-TDNN model achieves an F1-score of 0.7349 (a 3.7% improvement over audio-only SOTA) on the DAIC-WoZ dataset, while simultaneously reducing the speaker-identification accuracy by 50%. Our findings suggest that depression can be identified from speech signals without undue reliance on a speaker's identity, paving the way for privacy-preserving approaches to depression detection.
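The layer-wise weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear weight schedule, the function names `layer_adversarial_weights` and `combine_gradients`, and the toy gradient values are all assumptions made for illustration. The abstract states only that the adversarial weight differs between layers and that larger weights on the initial layers helped.

```python
from typing import List


def layer_adversarial_weights(num_layers: int, first: float, last: float) -> List[float]:
    """Linearly interpolate one adversarial weight per layer.

    Hypothetical schedule: the paper reports only that larger weights on the
    initial layers worked better, not how the weights were chosen.
    """
    if num_layers < 1:
        raise ValueError("num_layers must be >= 1")
    if num_layers == 1:
        return [first]
    step = (last - first) / (num_layers - 1)
    return [first + i * step for i in range(num_layers)]


def combine_gradients(
    dep_grads: List[List[float]],
    sid_grads: List[List[float]],
    weights: List[float],
) -> List[List[float]]:
    """Build a per-layer update direction.

    Each layer descends on the depression loss and ascends on the SID loss,
    with the SID gradient scaled by that layer's adversarial weight.
    """
    combined = []
    for g_dep, g_sid, lam in zip(dep_grads, sid_grads, weights):
        combined.append([d - lam * s for d, s in zip(g_dep, g_sid)])
    return combined


# Toy example: four layers, strongest adversarial weight on the first layer.
weights = layer_adversarial_weights(num_layers=4, first=1.0, last=0.25)
dep_grads = [[0.5, -0.5]] * 4  # made-up depression-loss gradients per layer
sid_grads = [[1.0, 1.0]] * 4   # made-up SID-loss gradients per layer
combined = combine_gradients(dep_grads, sid_grads, weights)
```

With a uniform scheme every layer would share a single weight. Here the first layer's SID gradient is subtracted at full strength and the last layer's at a quarter of that, so speaker information is suppressed most strongly early in the network.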

Journal: Interspeech
Publication Date: Aug 20, 2023
Citations: 9

Similar Papers

Speaker Embeddings for Diarization of Broadcast Data In The Allies Challenge
Anthony Larcher ... Marie Tahon
06 Jun 2021

Estimating phoneme class conditional probabilities from raw speech signal using convolutional neural networks
Dimitri Palaz ... Mathew Magimai-Doss
25 Aug 2013

Direct word discovery from speech signals based on hierarchical Dirichlet process-hidden language model and deep sparse autoencoder
Tadahiro Taniguchi ... Ryo Nakashima
01 Sep 2016

Residual Information in Deep Speaker Embedding Architectures
Adriana Stan
Mathematics | VOL. 10
23 Oct 2022
