SPNet: A deep network for broadcast sports video highlight generation

Abdullah Aman Khan, Jie Shao

doi:10.1016/j.compeleceng.2022.107779

Abstract

Professionally broadcast sports videos usually have long durations but contain only a few exciting events. In general, professional bodies and amateur content creators spend thousands of man-hours manually cropping the exciting segments from these long videos to produce handcrafted highlights. Sports enthusiasts rely on such highlights to keep themselves updated with the latest happenings. There is therefore a need for a method that accurately and automatically recognizes the exciting activities in a sports game. To address this need, we present SPNet, a deep learning-based network that recognizes exciting sports activities by exploiting high-level visual feature sequences and automatically generates highlights. The proposed SPNet combines the strengths of 3D convolutional networks and Inception blocks for accurate activity recognition. We divide sports video excitement into views, actions, and situations. Moreover, we provide 156 new annotations for about twenty-three thousand videos of the SP-2 dataset. Extensive experiments are conducted on two datasets, SP-2 and C-sports, and the results demonstrate the superiority of the proposed SPNet. Our proposed method achieves the highest performance for view, action, and situation activities, with an average accuracy of 76% on the SP-2 dataset and 82% on the C-sports dataset.
