Abstract

Adults over the age of 60 years are a growing population at risk for depression, and there is a need for automatic screening tools for this illness. Most existing voice-based depression datasets comprise speakers younger than 60, and variations in speech due to age and depression are not well understood. This study, which uses Patient Health Questionnaire scores as depression-severity ground truth, explores automatic depression detection using acoustic prosodic, spectral, landmark, and voice quality features derived from smartphone recordings of 152 speakers across four age ranges (18–34, 35–48, 49–62, and 63–79). An age-dependent modeling paradigm for voice-based depression detection is proposed and evaluated. Results show that age-dependent models improve voice-based automatic depression classification accuracy by up to 10% absolute compared with an age-agnostic model. Further, compared with age-agnostic and gender-dependent models, age-dependent models often produced higher depressed-class identification F-scores (gains of up to 0.39 absolute). While automatically extracted acoustic voice features lead to statistically significant depression detection accuracy gains over the age-agnostic modeling baseline (4%–9% absolute), manually extracted voice quality features are also useful (4%–7% absolute gains over baseline). This investigation demonstrates the benefits of age modeling for improving voice-based depression screening via smart devices.
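The core of the age-dependent paradigm is routing each speaker to a classifier trained on their age range rather than a single pooled model. As a minimal illustrative sketch (not the authors' implementation; the bucket boundaries come from the abstract, while the function names and the callable-per-bucket model interface are assumptions for illustration):

```python
# Hypothetical sketch of age-dependent model routing: each speaker's
# features are dispatched to a classifier trained for their age bucket.
AGE_BUCKETS = [(18, 34), (35, 48), (49, 62), (63, 79)]  # ranges from the study

def bucket_for(age):
    """Return the index of the age bucket containing `age`, or None if out of range."""
    for i, (lo, hi) in enumerate(AGE_BUCKETS):
        if lo <= age <= hi:
            return i
    return None

def predict_depression(features, age, models):
    """Dispatch `features` to the age-matched model.

    `models` is assumed to map a bucket index to any callable classifier
    (e.g., a trained scikit-learn model's predict method).
    """
    idx = bucket_for(age)
    if idx is None:
        raise ValueError(f"age {age} falls outside the supported ranges")
    return models[idx](features)
```

In this framing, the age-agnostic baseline corresponds to using one shared model for all buckets, so the routing step is the only structural difference between the two paradigms.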
