Abstract

This paper proposes a hierarchical, multi-modal news item detection algorithm for parsing TV news program videos; it can be viewed as an intermediate solution between single-modal and semantic-based approaches. We first investigate the production model of TV news programs and then use the resulting domain knowledge to develop the algorithm. With the aid of multi-modal features, such as volume and zero-crossing rate in the audio track and keyframes and human faces in the video track, the proposed algorithm achieved satisfactory precision and recall when parsing a 6-hour news program test video.
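
To make the two audio features concrete, the following is a minimal sketch of how per-frame volume and zero-crossing rate might be computed. It is an illustration under stated assumptions, not the paper's implementation: volume is taken here as root-mean-square amplitude, frames are non-overlapping, and the function name `frame_features` and its parameters are hypothetical.

```python
import math


def frame_features(samples, frame_len):
    """Return a list of (rms_volume, zero_crossing_rate) per frame.

    Assumptions (not from the paper): frames do not overlap, volume is
    RMS amplitude, and a sample equal to zero counts as non-negative.
    """
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")

    features = []
    # Step through the signal one full frame at a time; a trailing
    # partial frame is ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]

        # Volume: root-mean-square amplitude of the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)

        # Zero-crossing rate: fraction of adjacent sample pairs whose
        # signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)

        features.append((rms, zcr))
    return features
```

In a sketch like this, low volume combined with a distinctive zero-crossing rate could serve as a cue for silence or speaker changes between news items, but how the paper actually combines these features with the video cues is not specified in the abstract.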
