Show me where the action is!

Timothy Callemein,Wim Boes,Luc Van Eycken,Ali Diba,Tinne Tuytelaars,Hugo Van Hamme,Toon Goedemé,Tom Roussel,Luc Van Gool,Floris De Feyter

doi:10.1007/s11042-020-09616-9

Abstract

Reality TV shows have gained popularity, motivating many production houses to bring new variants for us to watch. Compared to traditional TV shows, reality TV shows contain spontaneous, unscripted footage. Computer vision techniques could partially replace the manual labour needed to record and process this spontaneity. However, automated real-world video recording and editing is a challenging topic. In this paper, we propose a system that utilises state-of-the-art video and audio processing algorithms to, on the one hand, automatically steer cameras, replacing camera operators, and, on the other hand, detect all audiovisual action cues in the recorded video, to ease the job of the film editor. This publication hence has two main contributions. The first is automating the steering of multiple Pan-Tilt-Zoom (PTZ) cameras to take aesthetically pleasing medium shots of all the people present. These shots need to comply with cinematographic rules and are based on the poses acquired by a pose detector. Secondly, when a huge amount of audio-visual data has been collected, it becomes labour-intensive for a human editor to retrieve the relevant fragments. As a second contribution, we combine state-of-the-art audio and video processing techniques for sound activity detection, action recognition, face recognition, and pose detection to decrease the required manual labour during and after recording. Used during post-processing, these techniques produce metadata that allows footage to be filtered, decreasing the search space. We extended our system further by producing timelines that unite the generated metadata, giving the editor a quick overview. We evaluated our system on three in-the-wild reality TV recording sessions of 24 hours (× 8 cameras) each, taken in real households.
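The abstract does not specify how the metadata timelines are represented. As a rough illustration only, the idea of uniting detector outputs into one timeline and filtering footage could be sketched as follows. The `Event` record, its fields, the function names, and the `gap` parameter are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    """One detection produced by some detector (hypothetical record layout)."""
    camera: int    # index of the camera that recorded the footage
    start: float   # start time in seconds from the beginning of the session
    end: float     # end time in seconds
    source: str    # detector that produced it, e.g. "sound", "action", "face", "pose"
    label: str     # detector-specific label, e.g. "speech" or a person name


def build_timeline(events: Iterable[Event]) -> List[Event]:
    """Unite events from all detectors into one list ordered by start time."""
    return sorted(events, key=lambda e: (e.start, e.camera))


def filter_events(timeline: Iterable[Event],
                  source: Optional[str] = None,
                  label: Optional[str] = None) -> List[Event]:
    """Keep only events matching the given detector and/or label."""
    return [e for e in timeline
            if (source is None or e.source == source)
            and (label is None or e.label == label)]


def merged_intervals(events: Iterable[Event],
                     gap: float = 0.0) -> List[Tuple[float, float]]:
    """Merge overlapping (or nearly adjacent) events into fragments to review."""
    merged: List[List[float]] = []
    for e in sorted(events, key=lambda e: e.start):
        if merged and e.start <= merged[-1][1] + gap:
            # Event touches the previous fragment: extend it.
            merged[-1][1] = max(merged[-1][1], e.end)
        else:
            merged.append([e.start, e.end])
    return [(s, t) for s, t in merged]


events = [
    Event(camera=0, start=10, end=20, source="sound", label="speech"),
    Event(camera=1, start=15, end=30, source="action", label="cooking"),
    Event(camera=0, start=100, end=110, source="face", label="person_a"),
]
timeline = build_timeline(events)
fragments = merged_intervals(timeline, gap=5)
```

In this sketch an editor would only need to inspect the merged fragments rather than the full recording, which is one way to read the "decreasing the search space" claim. The actual system may organise its metadata quite differently.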


Journal: Multimedia Tools and Applications
Publication Date: Sep 2, 2020
License type: open-access

Similar Papers

Pose Detection for Partially Occluded Persons in Spectator Crowds
Arif Mahmood ... Mubarak Shah
01 Jan 2015

Action analysis and video summarisation to efficiently manage and interpret video data
Johanna Carvajal Gonzalez
18 Nov 2016

Cloud-based platform for computer vision applications
Sidi Ahmed Mahmoudi ... Fabian Lecron
21 Jul 2017

Audio and Speech Processing with MATLAB
Paul Hill
07 Dec 2018
