Abstract
This study aimed to develop a classification model to detect and distinguish apathy and depression based on text, audio, and video features, and to use the Shapley additive explanations (SHAP) toolkit to increase model interpretability. Subjective scales and objective experiments were administered to 319 patients with mild cognitive impairment (MCI) to measure apathy and depression. The patients were classified into four groups: depression only, apathy only, depressed-apathetic, and normal. Speech, facial, and text features were extracted using open-source data analysis toolkits. Multiclass classification and the SHAP toolkit were used to develop the classification model and to explain the contributions of specific features. The macro-averaged F1 score and accuracy of the overall model were 0.91 and 0.90, respectively. The accuracies for the apathetic, depressed, depressed-apathetic, and normal groups were 0.98, 0.88, 0.93, and 0.82, respectively. The SHAP analysis identified speech features (Mel-frequency cepstral coefficient (MFCC) 4, spectral slopes, F0, F1), facial features (action units (AUs) 14, 26, 28, and 45), and a text feature (text 6 semantic) associated with apathy, while speech features (spectral slopes, shimmer, F0) and facial features (AUs 2, 6, 7, 10, 14, 26, and 45) were associated with depression. Beyond these shared features, additional speech (MFCC 2, loudness) and facial (AU 9) features were observed in the depressed-apathetic group. Apathy and depression shared some verbal and facial features while also exhibiting distinct ones. A combination of text, audio, and video features could improve the early detection and differential diagnosis of apathy and depression in MCI patients.
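To make the described pipeline concrete, the sketch below shows one way a multiclass classifier over fused speech, facial, and text features can be explained per diagnostic group with the SHAP toolkit. This is a minimal illustration under stated assumptions, not the authors' implementation: the random-forest model, the synthetic data, and the feature names (drawn from those reported in the abstract) are placeholders.

```python
# Minimal sketch (assumed, not the study's code): a multiclass classifier
# over fused text/audio/video features, explained per class with SHAP.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder fused feature matrix: speech (MFCCs, F0, ...), facial action
# units (AUs), and text-semantic features concatenated per patient.
feature_names = ["mfcc_4", "spectral_slope", "f0", "shimmer",
                 "au_14", "au_26", "au_45", "text_semantic_6"]
X = rng.normal(size=(319, len(feature_names)))   # 319 MCI patients (synthetic)
y = rng.integers(0, 4, size=319)                 # 4 diagnostic groups

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

pred = model.predict(X_te)
print("macro-F1:", f1_score(y_te, pred, average="macro"))
print("accuracy:", accuracy_score(y_te, pred))

# SHAP values quantify each feature's contribution to each class's
# prediction; ranking mean |SHAP| per class surfaces class-specific features.
shap_values = np.asarray(shap.TreeExplainer(model).shap_values(X_te))
# Output shape varies across shap versions; normalize to
# (n_classes, n_samples, n_features).
if shap_values.shape[0] != 4:
    shap_values = np.moveaxis(shap_values, -1, 0)
for c in range(4):
    ranking = np.abs(shap_values[c]).mean(axis=0)
    top = np.argsort(ranking)[::-1][:3]
    print(f"class {c} top features:", [feature_names[i] for i in top])
```

In the study, an analogous per-class ranking would separate features shared between apathy and depression from those specific to one group.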