Abstract
In response to the growing number of patients with depression, this paper proposes an artificial intelligence method for identifying depression from voice signals, with the aim of improving the efficiency of diagnosis and treatment. First, a pre-trained model, wav2vec 2.0, is fine-tuned to encode and contextualize speech, yielding high-quality voice features. The model is then applied to the publicly available Distress Analysis Interview Corpus - Wizard of Oz (DAIC-WOZ) dataset. For the binary depression-recognition task, the results show a precision of 93.96%, a recall of 94.87%, and an F1 score of 94.41%, with an overall classification accuracy of 96.48%. For the four-class task evaluating depression severity, all precision rates exceed 92.59%, all recall rates exceed 92.89%, all F1 scores exceed 93.12%, and the overall classification accuracy is 94.80%. These findings indicate that the proposed method effectively improves classification accuracy in scenarios with limited data, performing well in both depression identification and severity evaluation. In the future, this method could serve as a valuable supportive tool for the diagnosis of depression.
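To make the described pipeline concrete, the sketch below shows one common way to fine-tune wav2vec 2.0 as a speech classifier using the Hugging Face Transformers library. It is an illustration under assumptions rather than the authors' implementation: the checkpoint name, clip length, preprocessing, and hyperparameters are placeholders, and only the number of output labels (2 for detection, 4 for severity) follows the abstract.

```python
# Minimal sketch (assumed setup, not the paper's code): fine-tuning wav2vec 2.0
# as a sequence classifier for depression recognition.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

model_name = "facebook/wav2vec2-base"   # assumed base checkpoint
num_labels = 2                          # 2-class detection; set to 4 for severity grading

extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    model_name, num_labels=num_labels
)

# waveform: 1-D mono audio at 16 kHz, e.g. a segment from a DAIC-WOZ interview
waveform = torch.randn(16000 * 5)       # placeholder 5-second clip
inputs = extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

# Inference: the contextualized speech representation is pooled and classified
with torch.no_grad():
    logits = model(**inputs).logits
predicted_class = logits.argmax(dim=-1).item()
```

During fine-tuning, passing labels to the forward call produces a cross-entropy loss that can be optimized with a standard training loop; changing num_labels from 2 to 4 adapts the same setup to the severity-evaluation task.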