The advancement of computer vision technology has led to sophisticated algorithms capable of accurately recognizing human actions from red-green-blue (RGB) videos recorded by drone cameras. Despite this exceptional potential, human action recognition still faces many challenges, including the tendency of humans to perform the same action in different ways, limited camera angles, and a restricted field of view. In this research article, a system is proposed to tackle the aforementioned challenges using drone-recorded RGB videos as input. First, each video is split into its constituent frames, and gamma correction is applied to every frame to obtain an illumination-optimized image. Felzenszwalb's algorithm then segments the human out of the input image, and a human silhouette is generated. From the silhouette, a skeleton is extracted to locate thirteen body key points. These key points are used to perform elliptical modeling, governed by the Gaussian mixture model-expectation maximization (GMM-EM) algorithm, to estimate the individual boundaries of the body parts. The elliptical models of the body parts are then used to locate fiducial points which, when tracked, provide highly informative cues about the performed action. Additional extracted features include a 3D point cloud feature vector, the relative distances and velocities of the key points, and their mutual angles. The features are optimized via quadratic discriminant analysis, and finally a convolutional neural network is trained to perform the action classification. Three benchmark datasets, the Drone-Action dataset, the UAV-Human dataset, and the Okutama-Action dataset, were used for comprehensive experimentation. The system outperformed state-of-the-art approaches, securing accuracies of 80.03%, 48.60%, and 78.01% on the Drone-Action, UAV-Human, and Okutama-Action datasets, respectively.
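As a rough illustration of the preprocessing front-end described above (frame extraction, gamma correction, and Felzenszwalb segmentation), the sketch below uses OpenCV and scikit-image; the gamma value and the Felzenszwalb parameters are placeholder assumptions, not the settings reported in the paper, and the subsequent silhouette, skeleton, and feature-extraction stages are omitted.

```python
# Minimal sketch, assuming OpenCV and scikit-image are available.
# Parameter values are illustrative placeholders, not the paper's settings.
import cv2
from skimage.exposure import adjust_gamma
from skimage.segmentation import felzenszwalb

def preprocess_video(path, gamma=1.5, scale=100, sigma=0.5, min_size=200):
    """Yield (gamma-corrected frame, Felzenszwalb segment labels) per frame."""
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        # Gamma correction to normalize illumination before segmentation.
        corrected = adjust_gamma(rgb, gamma)
        # Graph-based Felzenszwalb segmentation; selecting the person
        # segments to build the silhouette would follow this step.
        labels = felzenszwalb(corrected, scale=scale, sigma=sigma,
                              min_size=min_size)
        yield corrected, labels
    cap.release()
```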