Abstract

A three-dimensional mathematical model can represent the multi-dimensional information in continuous video stream data well, and is an important tool in video classification research. To improve the accuracy of video classification, this paper proposes an improved video stream classification method, based on a three-dimensional convolutional network, that uses video features at different time scales. First, video frame sequences at different time scales are input into the 3D network for feature extraction. Second, the features from the different time scales are weighted and fused. Then, after the multiple sequential models are merged, the fused three-dimensional convolutional neural network model is constructed with fully connected layers, which weakens the influence of optical flow input on the video classification results. In experiments on the 101 action classes of the UCF-101 video stream dataset, the improved network model achieves 90.6% classification accuracy, 2.9% higher than that of the 3DConv_Ensemble model [7].
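The two ideas named above, sampling frame sequences at different time scales and fusing their features by weight, can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the centered sampling scheme, the strides, and the fusion weights are all assumptions chosen for illustration, and the 3D network that would turn each clip into a feature vector is left out.

```python
def sample_clip_indices(num_frames, clip_len, stride):
    """Pick clip_len frame indices spaced `stride` apart, centered in the video.

    A larger stride covers a longer time span with the same number of frames,
    which is one simple way to obtain inputs at different time scales
    (an assumption; the abstract does not specify the sampling scheme).
    """
    span = (clip_len - 1) * stride + 1  # frames covered by the clip
    if span > num_frames:
        raise ValueError("video too short for this clip length and stride")
    start = (num_frames - span) // 2
    return [start + i * stride for i in range(clip_len)]


def fuse_features(features_per_scale, weights):
    """Weighted average of equally sized feature vectors, one per time scale.

    The weights are normalized to sum to 1. Their values are hypothetical;
    the abstract does not state how the weights are chosen.
    """
    if len(features_per_scale) != len(weights):
        raise ValueError("need exactly one weight per time scale")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(features_per_scale[0])
    fused = [0.0] * dim
    for features, weight in zip(features_per_scale, weights):
        if len(features) != dim:
            raise ValueError("feature vectors must have the same length")
        for k, value in enumerate(features):
            fused[k] += (weight / total) * value
    return fused


if __name__ == "__main__":
    # Three hypothetical time scales over a 64-frame video, 16 frames per clip.
    for stride in (1, 2, 4):
        print(stride, sample_clip_indices(64, 16, stride))
    # Toy two-dimensional features from two scales, weighted 3:1.
    print(fuse_features([[1.0, 0.0], [0.0, 1.0]], [3, 1]))
```

In a full pipeline, each index list would select the frames fed to the 3D network, and the fused vector would be passed to the fully connected layers for classification.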
