Abstract

Automatic detection and annotation of semantic events is the key to content-based video retrieval. Because existing research on snooker video analysis does not meet the needs of event detection, we propose an event detection system for snooker game video based on the fusion of multimodal information. Our main contributions are as follows. First, we propose a full-table view detection method that combines color and geometric features and can precisely locate the score bar and turn indicator. Second, we implement a new solution for text segmentation and recognition in the score bar and turn indicator, using a purpose-built database of official players to compensate for the deficiencies of optical character recognition (OCR). Third, we propose an audio classification approach based on the hidden Markov model to recognize applause, laughter, sighs, and the sounds of shots in replays. Fourth, in the visual modality, we detect replays using optical flow block matching based on dynamic programming. By fusing visual, audio, and text information with domain knowledge of snooker, we implement detection algorithms for nine semantic events: frames, high breaks, defensive counterattacks, long considerations, fouls, nice shots, nice safeties, faults, and funny events. The experimental results show that the proposed method achieves high performance for most semantic events, and a comparison with other methods demonstrates its superior performance.
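
The idea of using a player database to compensate for OCR errors can be illustrated with a minimal sketch. It is not the system's actual implementation: the abstract does not say how names are matched, so the approximate string matching from Python's standard `difflib` module is an assumption here, and the function name `correct_player_name`, the `cutoff` value, and the sample database entries are all hypothetical.

```python
import difflib

# Hypothetical sample of a player database (illustrative entries only;
# the real database described in the abstract is not reproduced here).
PLAYER_DATABASE = ["Ronnie O'Sullivan", "Mark Selby", "Judd Trump", "John Higgins"]


def correct_player_name(ocr_text, players=PLAYER_DATABASE, cutoff=0.6):
    """Return the database name closest to the OCR output, or None.

    Assumption: similarity is measured with difflib's sequence ratio,
    compared case-insensitively. `cutoff` is an arbitrary threshold.
    """
    # Map lowercased names back to their canonical spelling.
    lowered = {name.lower(): name for name in players}
    matches = difflib.get_close_matches(
        ocr_text.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None


# A noisy OCR reading is snapped to the nearest known player name.
print(correct_player_name("R0nnie O'Sulliuan"))
# Text that resembles no known player is rejected rather than guessed.
print(correct_player_name("qqqq"))
```

Restricting the recognizer's output to a closed list of known names is what lets a weak OCR result still yield a reliable label; unmatched text is returned as `None` so that downstream steps can ignore it.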
