Abstract
This paper is concerned with posture recognition using ensemble convolutional neural networks (CNNs) in home environments. With the increasing number of elderly people living alone at home, posture recognition is very important for helping elderly people cope with sudden danger. Traditionally, to recognize posture, it was necessary to obtain the coordinates of the body points, depth, frame information of video, and so on. In conventional machine learning, there is a limitation in recognizing posture directly using only an image. However, with advancements in the latest deep learning, it is possible to achieve good performance in posture recognition using only an image. Thus, we performed experiments based on VGGNet, ResNet, DenseNet, InceptionResNet, and Xception as pre-trained CNNs using five types of preprocessing. On the basis of these deep learning methods, we finally present the ensemble deep model combined by majority and average methods. The experiments were performed by a posture database constructed at the Electronics and Telecommunications Research Institute (ETRI), Korea. This database consists of 51,000 images with 10 postures from 51 home environments. The experimental results reveal that the ensemble system by InceptionResNetV2s with five types of preprocessing shows good performance in comparison to other combination methods and the pre-trained CNN itself.
Highlights
Posture recognition is a technology that classifies and identifies the posture of a person and has received considerable attention in the field of computer vision
convolutional neural networks (CNNs) are designed by integrating a feature extractor and a classifier into a network to automatically train them through data and exhibit the optimal performance for image processing [18]
There are many pre-trained deep models based on the CNN, such as VGGNet [19], ResNet [20], DenseNet [21], InceptionResNet [22], and Xception [23]
Summary
Posture recognition is a technology that classifies and identifies the posture of a person and has received considerable attention in the field of computer vision. CNNs are designed by integrating a feature extractor and a classifier into a network to automatically train them through data and exhibit the optimal performance for image processing [18]. To recognize posture, it was necessary to obtain the coordinates of the body points or inertial data This was achieved using a depth camera such as Kinect, image processing through a body model, or devices for capturing motion connected to the body; regarding the latter, it is a nuisance to wear these sensors with care in everyday life. Since posture recognition is performed using images, it can be applied to inexpensive cameras, and the device used for acquiring experimental data has an inexpensive feature even though it supports a depth camera.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have