Indexing, Object Segmentation, and Event Detection in News and Sports Videos

Paisarn Muneesawang, Ning Zhang, Ling Guan

doi:10.1007/978-3-319-11782-9_7

Abstract

This chapter first introduces a video parsing algorithm that operates in the compressed domain. The algorithm builds on the conventional solution, in which energy histograms of DC coefficients are used to calculate the distance between consecutive I/P frames, and the DC coefficients of the P-frames are obtained by frame conversion. The detection results are enhanced by using the ratio between two sliding windows to amplify the transitional regions. Secondly, in order to index news video at various levels, a template-frequency model is used to characterize the spatio-temporal information of news stories. A system employing this indexing structure is well suited to news-on-demand applications. Thirdly, a method for video object segmentation using Graph Cut and histograms of oriented gradients is presented. This method improves the segmentation of objects that otherwise segment poorly because of poor luminance distribution, weak edges, or backgrounds with similar color and movement. Fourthly, the chapter presents an automatic and robust method for detecting human faces in video sequences that combines feature extraction and face detection based on local normalization, the Gabor wavelet transform, and the AdaBoost algorithm. Finally, an application system is presented for classifying American football videos according to events of interest. The system consists of two stages: the first is responsible for play event localization, and the second for feature mapping and classification. The first stage employs MPEG-7 motion activity descriptors to detect the starting point of a play event, whereas the second stage uses MPEG-7 motion and audio descriptors along with Mel-frequency cepstral coefficient (MFCC) features to classify the events using Fisher’s linear discriminant analysis (LDA).
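The parsing step described above (histogram distances between consecutive frames, amplified by a ratio of two sliding windows) can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the following are assumptions made only for illustration: per-frame histograms are taken as already computed, the distance is a plain L1 distance, and the ratio divides the mean of a short centered window by the mean of a longer centered window. The function names `frame_distances`, `window_ratio`, and `detect_transitions`, as well as the window sizes and threshold, are hypothetical and not taken from the chapter.

```python
import numpy as np


def frame_distances(histograms):
    """L1 distance between each pair of consecutive frame histograms.

    Assumption: each histogram is a 1-D array already extracted from the
    DC coefficients of an I/P frame; the extraction itself is not shown.
    """
    return np.array(
        [np.abs(a - b).sum() for a, b in zip(histograms[:-1], histograms[1:])],
        dtype=float,
    )


def sliding_mean(x, size):
    """Centered moving average with edge padding; `size` must be odd."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    pad = size // 2
    padded = np.pad(x, pad, mode="edge")
    kernel = np.ones(size) / size
    # 'valid' on the padded signal returns exactly len(x) samples.
    return np.convolve(padded, kernel, mode="valid")


def window_ratio(distances, short=1, long=7, eps=1e-6):
    """Ratio of a short-window mean to a long-window mean (assumed form).

    A large ratio means the local distance stands out from its
    neighborhood, which is what amplifies transitional regions.
    """
    return sliding_mean(distances, short) / (sliding_mean(distances, long) + eps)


def detect_transitions(histograms, short=1, long=7, threshold=3.0):
    """Indices i where a transition is flagged between frame i and i + 1."""
    ratio = window_ratio(frame_distances(histograms), short, long)
    return [int(i) for i in np.flatnonzero(ratio > threshold)]


if __name__ == "__main__":
    # Synthetic example: ten frames of one "shot", then ten of another.
    shot_a = np.array([0.7, 0.2, 0.1, 0.0])
    shot_b = np.array([0.0, 0.1, 0.2, 0.7])
    frames = [shot_a] * 10 + [shot_b] * 10
    print(detect_transitions(frames))
```

Dividing by a local average rather than thresholding the raw distance makes the decision relative to nearby activity, so a fixed threshold is less sensitive to the overall motion level of a sequence. This is one plausible reading of the sliding-window ratio, not a reproduction of the chapter's method.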

Full Text