Abstract

In recent years, following sudden public health events, online teaching has become a mainstream teaching approach, and the number of teaching videos has grown rapidly. Extracting action-related image information from videos is therefore important for video understanding. This research proposes extracting image features along the spatiotemporal dimension based on deep learning, using a spatiotemporal network for skeleton-based action recognition, and building a CSTGAT model on top of a convolutional neural network. The experimental results show that the CSTGAT model achieves an accuracy of 98.47%, a precision of 97.43%, and a recall of 71.65% after being trained with the convolutional neural network. Furthermore, it needs only 217 iterations to converge stably. After 100 tests, the F1 value of the CSTGAT model was 96.83%. In summary, the proposed model has high accuracy, good recall, and strong expressiveness. It could provide a solution for intelligent long-distance human-machine interaction and could be used in online teaching.
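The abstract does not specify the internals of the CSTGAT model, but the general pattern it names — spatiotemporal feature extraction over skeleton data — can be illustrated with a minimal sketch. The example below is a hypothetical NumPy toy, not the authors' implementation: it applies a spatial graph convolution over skeleton joints (mixing features along skeleton edges) followed by a simple temporal smoothing, on an assumed 3-joint chain skeleton.

```python
import numpy as np

# Hypothetical minimal spatiotemporal block for skeleton action features.
# Shapes: x is (T, N, C) = (frames, joints, channels); A is (N, N) adjacency.

def normalize_adjacency(A):
    """Symmetrically normalize A + I, as in standard graph convolutions."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def spatial_graph_conv(x, A_norm, W):
    """Mix each joint's features with its skeleton neighbors, then project channels."""
    # n,m index joints; t frames; c in-channels; o out-channels.
    return np.einsum('nm,tmc,co->tno', A_norm, x, W)

def temporal_conv(x, kernel=3):
    """Average each joint's features over a sliding temporal window."""
    T, _, _ = x.shape
    pad = kernel // 2
    xp = np.pad(x, ((pad, pad), (0, 0), (0, 0)), mode='edge')
    return np.stack([xp[t:t + kernel].mean(axis=0) for t in range(T)])

# Toy skeleton: 3 joints in a chain (0-1-2), 4 frames, 2 channels (assumed sizes).
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3, 2))
W = rng.standard_normal((2, 2))

h = temporal_conv(spatial_graph_conv(x, normalize_adjacency(A), W))
print(h.shape)  # (4, 3, 2): per-frame, per-joint spatiotemporal features
```

A full attention-based model would replace the fixed normalized adjacency with learned attention weights between joints, but the data flow — spatial aggregation over the skeleton graph, then temporal aggregation over frames — is the same.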
