Abstract

In this paper, we propose a hybrid deep neural network model for recognizing human actions in videos. A hybrid deep neural network model is designed by the fusion of homogeneous convolutional neural network (CNN) classifiers. The ensemble of classifiers is built by diversifying the input features and varying the initialization of the weights of the neural network. The convolutional neural network classifiers are trained to output a value of one, for the predicted class and a zero, for all the other classes. The outputs of the trained classifiers are considered as confidence value for prediction so that the predicted class will have a confidence value of approximately 1 and the rest of the classes will have a confidence value of approximately 0. The fusion function is computed as the maximum value of the outputs across all classifiers, to pick the correct class label during fusion. The effectiveness of the proposed approach is demonstrated on UCF50 dataset resulting in a high recognition accuracy of 99.68%.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.