Abstract

Background: Machine-learning methods that use acoustic features to diagnose major depressive disorder (MDD) lack sufficient evidence from large-scale samples and clinical trials. This study aimed to evaluate the effectiveness of the promising i-vector method on a large sample of women with clinically diagnosed recurrent MDD, examine its robustness, and provide an explicit acoustic explanation of the i-vectors.

Methods: We collected utterances edited from the clinical interview speech records of 785 depressed and 1,023 healthy individuals. We then extracted Mel-frequency cepstral coefficient (MFCC) features and MFCC i-vectors from these utterances. To examine the effectiveness of i-vectors, we compared the performance of binary logistic regression using MFCC i-vectors with that using MFCC features, and tested its robustness across utterance durations. We also computed the correlation between MFCC features and MFCC i-vectors to analyze the acoustic meaning of the i-vectors.

Results: Relative to MFCC features, the i-vectors improved the area under the curve (AUC) by 7% and 14% on different utterances. Classification results stabilized when the utterance duration exceeded 40 s. The i-vectors were consistently correlated, either positively or negatively, with the maximum, minimum, and deviations of the MFCC features.

Limitations: This study included only women.

Conclusions: The i-vectors can improve the AUC by 14% on a large-scale clinical sample. The system is robust for utterance durations > 40 s. This study provides a foundation for exploring the clinical application of voice features in the diagnosis of MDD.
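The two statistics this abstract relies on, the AUC used to compare classifiers and the correlation used to interpret the i-vectors, can be illustrated with a minimal sketch. This is not the authors' code: the function names `auc` and `pearson` and the toy numbers are illustrative assumptions, and none of the values come from the study.

```python
from math import sqrt


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case
    (label 1) gets a higher score than a randomly chosen negative
    case (label 0); ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy classifier scores (hypothetical), with 1 = depressed, 0 = healthy.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)

# Toy feature pair (hypothetical): one i-vector dimension against
# a per-utterance summary statistic of an MFCC coefficient.
toy_corr = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In practice the scores passed to `auc` would be the predicted probabilities of a fitted binary logistic regression, computed once with MFCC features and once with MFCC i-vectors, so that the two AUC values can be compared directly.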
