Abstract

Unlike most existing neural network-based fall detection methods, which detect falls only in the temporal dimension, the algorithm proposed in this paper detects falls in both the spatial and temporal dimensions. A movement tube detection network, which integrates a 3D CNN with an object detection framework such as SSD, is proposed to detect human falls with constrained movement tubes. A constrained movement tube encapsulates the person with a sequence of bounding boxes; it fits the person closely and avoids peripheral interference. A 3D convolutional neural network encodes the motion and appearance features of a video clip, and these features are fed into the tube anchors generation layer, the softmax classification layer, and the movement tube regression layer. The movement tube regression layer fine-tunes the tube anchors into constrained movement tubes. A large-scale spatio-temporal (LSST) fall dataset is constructed from self-collected data to evaluate fall detection in both the spatial and temporal dimensions. LSST has three characteristics: large scale, annotation, and diversity of posture and viewpoint. Furthermore, comparative experiments on a public dataset show that the proposed algorithm achieves a sensitivity, specificity, and accuracy of 100%, 97.04%, and 97.23%, respectively, outperforming the existing methods.
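The three metrics reported in the abstract follow from the standard confusion-matrix definitions. The sketch below shows how they are usually computed. The function name and the counts in the usage line are hypothetical and are not taken from the paper.

```python
def fall_detection_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp: falls detected as falls          fn: falls that were missed
    tn: non-falls detected as non-falls  fp: non-falls flagged as falls
    """
    # Sensitivity: fraction of real falls that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-fall samples correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: fraction of all samples classified correctly.
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only (not the paper's data).
sens, spec, acc = fall_detection_metrics(tp=10, fp=2, tn=8, fn=0)
```

A sensitivity of 100% means that no fall in the test set was missed (fn = 0), while a specificity below 100% means that some non-fall activities were flagged as falls.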

Highlights

  • Inspired by [5,8,9], this paper extends a 2D object detection framework such as SSD into a movement tube detection network that detects human falls in both the spatial and temporal dimensions simultaneously

  • The movement tube detection network consists of three components: a 3D ConvNet, a tube anchors generation layer, and an output layer

  • The movement tube regression layer is similar to the corresponding layer of the object detection framework. In this paper, the bounding boxes generation layer and the bounding boxes regression layer of the object detection framework are extended to the tube anchors generation layer and the movement tube regression layer, respectively
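The extension from box anchors to tube anchors described in the highlights can be illustrated with a minimal sketch. It assumes that a tube anchor is one default box repeated over the frames of a clip, and that the regression layer predicts SSD-style center and size offsets for each frame. Both assumptions, along with the names `make_tube_anchor` and `decode_tube`, are illustrative and are not taken from the paper; the variance scaling used in SSD is omitted.

```python
import math


def make_tube_anchor(cx, cy, w, h, num_frames):
    """Build a tube anchor by repeating one default box (cx, cy, w, h)
    over num_frames frames. Assumption: anchors are static across frames."""
    return [(cx, cy, w, h) for _ in range(num_frames)]


def decode_tube(anchor, offsets):
    """Apply per-frame offsets (dx, dy, dw, dh) to a tube anchor.

    Uses the common SSD-style parameterization (an assumption here):
    centers shift in proportion to the anchor size, and width and
    height scale exponentially.
    """
    if len(anchor) != len(offsets):
        raise ValueError("anchor and offsets must cover the same frames")
    tube = []
    for (cx, cy, w, h), (dx, dy, dw, dh) in zip(anchor, offsets):
        tube.append((cx + dx * w, cy + dy * h,
                     w * math.exp(dw), h * math.exp(dh)))
    return tube


# Hypothetical 2-frame clip: the person moves right and the box gets wider.
anchor = make_tube_anchor(0.5, 0.5, 0.2, 0.4, num_frames=2)
tube = decode_tube(anchor, [(0.0, 0.0, 0.0, 0.0), (0.5, 0.0, math.log(2.0), 0.0)])
```

Because each frame has its own offsets, the decoded tube can follow the person as posture and position change during a fall, which is what lets it enclose the person closely.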

Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Inspired by [5,8,9], this paper extends a 2D object detection framework such as SSD into a movement tube detection network that detects human falls in both the spatial and temporal dimensions simultaneously. The collected LSST dataset is the first in the field of human fall detection to contain a large number of videos annotated with bounding boxes. It aims to provide a data benchmark and to encourage further research into vision-based human fall detection in both the spatial and temporal dimensions.

Related Work
The Overview of Proposed Method
Movement tube regression layer
The Structure of the Proposed Neural Network
Loss detection
Data Augmentation
Post-Processing
Existing Fall Detection Datasets
Experiments and Discussion
Implementation
Ablation
Comparison to the State of the Art
Method
The Result of the Proposed Method
Findings
Conclusions