Abstract

In this paper, we aim to develop a method for creating personalized, high-presence multi-channel content for sports games through real-time content curation of various media streams captured or created by spectators. We use the live TV broadcast as ground-truth data and construct a machine-learning model that automatically curates multiple videos captured by spectators from different angles and zoom levels. The live TV broadcast of a baseball game follows curation rules that select a camera with a specific angle for specific scenes (e.g., a pitcher throwing a ball). As inputs for constructing the model, we use metadata such as image features (e.g., whether a pitcher is on the screen) extracted at fixed intervals from the baseball videos, together with game progress data (e.g., the inning number and the batting order). The output is the camera ID (among the spectators' multiple cameras) at each point in time. For evaluation, we targeted Spring-Selection high-school baseball games. As training data, we used the image features, the game progress data, and the camera position at each point in time in the TV broadcast. We captured videos of a baseball game from 7 different points in Hanshin Koshien Stadium with handheld video cameras and generated a sample data set by dividing the videos into fixed-interval segments. We split the sample data set into a training set and a test set and evaluated our method with two validation methods: (1) 10-fold cross-validation and (2) the hold-out method (e.g., training on the first and second innings and testing on the third inning). As a result, our method predicted camera-switching timings with an F-measure of 72.53% on weighted average for the base camera work and 92.1% for the fixed camera work.
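The abstract does not name the learning algorithm or the exact feature set, so the sketch below only illustrates the described pipeline under assumptions: a scikit-learn random forest stands in for the unspecified classifier, the per-segment features (pitcher_on_screen, batter_on_screen, inning number, batting order) are hypothetical, the data are random placeholders, and both validation schemes mentioned above are run.

```python
# A minimal sketch of the curation model, assuming a random-forest
# classifier; the paper does not specify the learner or feature names.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_segments = 600  # fixed-interval video segments (placeholder count)

# Per-segment metadata: image features plus game progress data.
X = np.column_stack([
    rng.integers(0, 2, n_segments),   # pitcher_on_screen (0/1), hypothetical
    rng.integers(0, 2, n_segments),   # batter_on_screen (0/1), hypothetical
    rng.integers(1, 10, n_segments),  # inning number
    rng.integers(1, 10, n_segments),  # batting order
])
# Target: which of the 7 spectator cameras matches the TV broadcast's
# camera work in each segment (random placeholder labels here).
y = rng.integers(0, 7, n_segments)

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# (1) 10-fold cross-validation with a weighted F-measure.
scores = cross_val_score(clf, X, y, cv=10, scoring="f1_weighted")
print(f"10-fold CV weighted F-measure: {scores.mean():.3f}")

# (2) Hold-out: train on innings 1-2, test on inning 3.
inning = X[:, 2]
train, test = inning <= 2, inning == 3
clf.fit(X[train], y[train])
print(f"Hold-out weighted F-measure: "
      f"{f1_score(y[test], clf.predict(X[test]), average='weighted'):.3f}")
```

With real data, X would come from per-segment image analysis and the game log, and y from aligning the TV broadcast's camera selection with the spectator cameras; the random placeholders exist only to make the sketch runnable.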
