Abstract

This article proposes a general scheme for multi-granularity semantic analysis and extraction of video data, developed for online instructional videos of college and university mathematics supported by intelligent information technology. The scheme unifies multi-level semantic analysis and multi-modal information fusion in a single model. It first proposes a method for detecting gradual shot transitions based on statistical distribution, and uses a key frame selection strategy constrained by temporal semantic context to represent the temporal content. Building on basic visual semantic recognition, it then derives a hierarchical, multi-granularity framework for visual semantic analysis and extraction. On the audio side, the sound spectrum obtained by time-frequency transformation serves as the observable feature of a hidden Markov model for semantic recognition of mathematics videos, which improves efficiency by 7.93%.
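To make the audio feature concrete, the sketch below computes a magnitude spectrum per short frame of a signal, which is one common form of time-frequency transformation. The abstract does not say which transform, window, frame length, or hop size is used, so the Hann window, `frame_len=64`, `hop=32`, and the function names here are illustrative assumptions rather than the article's settings. The hidden Markov model that would consume these frames as observations is not shown.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (an assumed choice of window)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Short-time Fourier magnitudes.

    Returns one list per frame, holding the magnitudes of
    frequency bins 0 .. frame_len // 2.
    """
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Apply the window to the current frame.
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT for bin k; adequate for short illustrative frames.
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Toy input: a pure tone with 8 cycles per 64 samples.
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(signal)

# Index of the strongest frequency bin in each frame.
peak_bins = [max(range(len(frame)), key=frame.__getitem__) for frame in spec]
```

Each row of `spec` is one observation vector in time order. In a pipeline like the one the abstract describes, such vectors (or features derived from them) would form the observation sequence of the hidden Markov model.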
