Abstract

We systematically analyze a modular deep learning pipeline that predicts depression severity from speech transcriptions. Using this pipeline, we investigate the role of popular deep learning architectures in creating representations for depression assessment. The proposed architectures are evaluated on the publicly available Extended Distress Analysis Interview Corpus (E-DAIC). Our results show that informative representations for depression assessment can be obtained without exploiting the temporal dynamics between descriptive text representations. More specifically, temporal pooling of latent representations outperforms the state of the art, which employs recurrent architectures, by 8.8% in terms of Concordance Correlation Coefficient (CCC).
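
To make the two technical terms above concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that temporal pooling means averaging per-segment latent vectors over time (the abstract does not name the pooling operator) and that CCC is Lin's coefficient computed with population variance and covariance. The function names `mean_pool` and `concordance_correlation_coefficient` are hypothetical.

```python
from statistics import fmean


def mean_pool(embeddings):
    """Average equal-length vectors over the time axis (assumed pooling operator)."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    # zip(*embeddings) groups values by feature dimension across time steps.
    return [fmean(dim) for dim in zip(*embeddings)]


def concordance_correlation_coefficient(y_true, y_pred):
    """Lin's CCC: 2*cov / (var_true + var_pred + (mean_true - mean_pred)^2)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = fmean(y_true)
    mean_pred = fmean(y_pred)
    # Population (biased) variance and covariance, an assumption of this sketch.
    var_true = fmean([(t - mean_true) ** 2 for t in y_true])
    var_pred = fmean([(p - mean_pred) ** 2 for p in y_pred])
    cov = fmean([(t - mean_true) * (p - mean_pred)
                 for t, p in zip(y_true, y_pred)])
    denom = var_true + var_pred + (mean_true - mean_pred) ** 2
    if denom == 0:
        # Both inputs are constant and identical; CCC is undefined here.
        return float("nan")
    return 2 * cov / denom
```

Unlike Pearson correlation, CCC penalizes a constant offset between predictions and labels: predictions that are perfectly correlated with the labels but shifted by a fixed amount score below 1.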
