Because football video scenes are fixed and relatively simple, events in them, such as shots and offside, have clear semantics. Football video analysis also draws on ample domain knowledge and has broad application prospects. Sports video intelligence analysis is usually organized as a three-level framework: a low-level feature layer, a middle-level key primitive generation layer, and a high-level event analysis layer. This paper proposes the MF_O2T (Moving Feature Online Target Tracking) algorithm. Starting from the annotated first frame, the algorithm extracts a set of standardized local image patches from the target regions of the visible and infrared images to serve as target convolution filters, removes the stadium's main color using a nonuniform quantization of the HSV color space, and extracts main-color histograms of the players from upper and lower blocks. Experimental results show that the algorithm is robust, adapts well to player tracking across different football video scenes, and meets the real-time requirements of football training.
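
The color step described above (nonuniform HSV quantization, removal of the stadium's main color, and per-block player histograms) can be illustrated with a minimal sketch. This is not the authors' implementation: the bin edges follow a commonly used 8 x 3 x 3 = 72-bin scheme chosen here as an assumption, the function names `quantize_hsv`, `dominant_bin`, and `player_block_histograms` are hypothetical, and the input is assumed to be already converted to HSV with H in [0, 360) and S, V in [0, 1].

```python
import numpy as np

# Assumed nonuniform bin edges (8 hue x 3 saturation x 3 value = 72 bins).
# These values are illustrative and are not taken from the paper.
H_EDGES = np.array([20, 40, 75, 155, 190, 270, 295, 316])  # degrees
S_EDGES = np.array([0.2, 0.7])
V_EDGES = np.array([0.2, 0.7])
N_BINS = 72


def quantize_hsv(hsv):
    """Map HSV pixels (H in [0, 360), S and V in [0, 1]) to bin indices 0..71."""
    # Hues at or above the last edge wrap around into hue bin 0 (red).
    h = np.searchsorted(H_EDGES, hsv[..., 0], side="right") % 8
    s = np.searchsorted(S_EDGES, hsv[..., 1], side="right")
    v = np.searchsorted(V_EDGES, hsv[..., 2], side="right")
    return 9 * h + 3 * s + v


def dominant_bin(frame_hsv):
    """Return the most frequent quantized color in a frame (assumed to be the field)."""
    bins = quantize_hsv(frame_hsv)
    return int(np.bincount(bins.ravel(), minlength=N_BINS).argmax())


def player_block_histograms(patch_hsv, field_bin):
    """Normalized color histograms of the upper and lower halves of a player patch.

    Pixels that fall into the field's dominant bin are discarded first.
    """
    bins = quantize_hsv(patch_hsv)
    mid = bins.shape[0] // 2
    hists = []
    for block in (bins[:mid], bins[mid:]):
        kept = block[block != field_bin]          # drop field-colored pixels
        hist = np.bincount(kept, minlength=N_BINS).astype(float)
        total = hist.sum()
        hists.append(hist / total if total > 0 else hist)
    return hists[0], hists[1]
```

Splitting the patch into upper and lower halves lets the descriptor capture shirt and shorts colors separately, which is one plausible reading of the "upper and lower blocks" in the text; how the resulting histograms are compared during tracking is not specified here.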