Abstract
Falls pose significant safety risks to people living alone, especially the elderly, and a fast, reliable fall detection system is an effective way to mitigate this hazard. We propose a multimodal method based on audio and video. While relying only on non-intrusive equipment, it reduces the false negatives that the most commonly used video-based methods suffer under insufficient lighting or when the subject moves beyond the camera's monitoring range. In the foreseeable future, methods based on audio-video fusion are therefore expected to become the leading solution for fall detection. Specifically, this article proposes the following methodology: the video-based model uses YOLOv7-Pose to extract skeleton keypoints, which are fed into a two-stream Spatial Temporal Graph Convolutional Network (ST-GCN) for classification, while the audio-based model computes log-scaled mel spectrograms and classifies them with a MobileNetV2 architecture. The two results are combined by decision-level fusion, using linear weighting and Dempster-Shafer (D-S) evidence theory. In our evaluation, the multimodal method significantly outperforms either single modality: sensitivity rises from 81.67% with video alone to 96.67% (linear weighting) and 97.50% (D-S theory), underscoring the effectiveness of integrating video and audio data for robust and reliable fall detection in complex, diverse daily-life environments.
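The decision-level fusion mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weight value, the basic probability assignments, and the use of a single ignorance mass `theta` over the frame {fall, nofall} are assumptions made for the example.

```python
def linear_fusion(p_video, p_audio, w_video=0.6):
    """Linear weighting: weighted average of the two modality
    fall probabilities (w_video is an assumed example weight)."""
    return w_video * p_video + (1.0 - w_video) * p_audio

def ds_fusion(m_video, m_audio):
    """Dempster's rule of combination over the frame of discernment
    {fall, nofall}, with 'theta' holding the ignorance mass assigned
    to the whole frame. Each input is a dict of basic probability
    assignments (masses) summing to 1."""
    focal = ("fall", "nofall", "theta")
    combined = {a: 0.0 for a in focal}
    conflict = 0.0
    for a in focal:
        for b in focal:
            m = m_video[a] * m_audio[b]
            if a == b:
                combined[a] += m
            elif "theta" in (a, b):
                # intersecting with the full frame keeps the specific hypothesis
                combined[a if b == "theta" else b] += m
            else:
                conflict += m  # {fall} and {nofall} are disjoint
    norm = 1.0 - conflict  # Dempster normalization
    return {a: v / norm for a, v in combined.items()}
```

For instance, fusing a confident video mass with a moderately confident audio mass reinforces the shared "fall" hypothesis, which is the behavior that lets the fused system recover cases one modality misses.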