Abstract

Background and Objective: By 2030, depression is projected to become the predominant mental disorder. As depression has grown in prominence, a growing number of affective computing studies have emerged, most of which use audiovisual methods to estimate depression scales. Existing studies often overlook the latent patterns in sequential data and do not exploit the fine-grained features of the Transformer to model behavioral features for video-based depression recognition (VDR). Methods: To address these gaps, we present an end-to-end sequential framework called Depressformer for VDR. The framework comprises three components: the Video Swin Transformer (VST) for deep feature extraction, a depression-specific fine-grained local feature extraction (DFLFE) module, and a depression channel attention fusion (DCAF) module that fuses the latent local and global features. Using the VST as the backbone network makes it possible to discern pivotal features more effectively. The DFLFE module enriches this process by focusing on the nuanced local features indicative of depression. The DCAF module is introduced to improve the modeling of the combined features relevant to VDR. Results: We validated the method extensively on the AVEC2013 and AVEC2014 depression databases. It achieves a root mean square error (RMSE) of 7.47 and a mean absolute error (MAE) of 5.49 on AVEC2013, and an RMSE of 7.22 and an MAE of 5.56 on AVEC2014. On the D-vlog dataset, the F1-score is 0.59. Conclusions: The experimental evaluations suggest that the Depressformer architecture delivers superior performance, with stability and adaptability across tasks, and can effectively identify the severity of depression. Code will be released at: https://github.com/helang818/Depressformer/.
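The two regression metrics reported above, RMSE and MAE, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names `rmse` and `mae` and the sample scores are hypothetical, chosen only to show how the two errors are computed from reference and predicted depression scores.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error between reference and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return sum(absolute) / len(absolute)


# Hypothetical scores for illustration only (not data from the paper).
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE = {rmse(reference, predicted):.2f}")
print(f"MAE  = {mae(reference, predicted):.2f}")
```

Because RMSE squares each error before averaging, it penalizes large individual mistakes more heavily than MAE, so RMSE is always at least as large as MAE on the same predictions, as in the figures reported above.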
