Towards computer-vision software tools to increase production and accessibility of video description for people with vision loss

Langis Gagnon, James Turner, Claude Chapdelaine, Denis Ouellet, Suzanne Mathieu, Samuel Foucher, Marc Lalonde, Maguelonne Heritier, Denis Laurendeau, Nath Tan Nguyen, David Byrns

doi:10.1007/s10209-008-0141-0

Abstract

This paper presents the status of an R&D project targeting the development of computer-vision tools to assist humans in generating and rendering video description for people with vision loss. Three principal issues are discussed: (1) production practices, (2) needs of people with vision loss, and (3) current system design, core technologies and implementation. The paper provides the main conclusions of consultations with producers of video description regarding their practices and with end users regarding their needs, as well as an analysis of described productions that led to a proposed video description typology. The current status of a prototype software tool, the audio-vision manager, is also presented; it uses several computer-vision technologies (shot transition detection, key-frame identification, key-face recognition, key-text spotting, visual motion, gait/gesture characterization, key-place identification, key-object spotting and image categorization) to automatically extract visual content, associate textual descriptions and add them to the audio track with a synthetic voice. A proof of concept is also briefly described for a first adaptive video description player, which allows end users to select various levels of video description.

Full Text