Abstract

With the rapid development of high-speed networks and digital video recording technologies, broadcast video plays an increasingly important role in daily life. In this paper, we propose a novel news story segmentation scheme that segments broadcast video into story units using a multi-modal information fusion (MMIF) strategy. Compared with traditional methods, the proposed scheme extracts a rich set of semantic-level features, including anchor person, topic caption, face, silence, acoustic change, audio keywords, and textual content. In parallel, we employ a multi-modal information fusion strategy to characterize news story boundaries by combining these visual, audio, and textual cues. Encouraging experimental results on the News Vision dataset demonstrate the effectiveness of the proposed scheme.
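The abstract does not detail how the visual, audio, and textual cues are actually combined, so the following is only a minimal sketch of one plausible late-fusion rule: each candidate boundary is scored by a weighted combination of per-modality cues and accepted if the score exceeds a threshold. The cue names, weights, and threshold below are illustrative assumptions, not the paper's reported MMIF strategy.

```python
from dataclasses import dataclass


@dataclass
class BoundaryCues:
    """Cues observed around one candidate story-boundary point (assumed set)."""
    anchor_shot: bool        # candidate shot detected as an anchor-person shot
    caption_change: bool     # topic caption appears or changes
    silence: bool            # silent audio segment near the candidate point
    acoustic_change: float   # audio change-detector score in [0, 1]
    keyword_score: float     # audio/text keyword-matching score in [0, 1]


def fuse_cues(cues: BoundaryCues, threshold: float = 0.5) -> bool:
    """Late fusion via a weighted linear combination (weights are hypothetical)."""
    score = (
        0.35 * cues.anchor_shot
        + 0.25 * cues.caption_change
        + 0.15 * cues.silence
        + 0.15 * cues.acoustic_change
        + 0.10 * cues.keyword_score
    )
    return score >= threshold


if __name__ == "__main__":
    candidate = BoundaryCues(
        anchor_shot=True,
        caption_change=True,
        silence=False,
        acoustic_change=0.4,
        keyword_score=0.2,
    )
    print("story boundary" if fuse_cues(candidate) else "no boundary")
```

A weighted-sum fusion is only one option; the same cue structure could feed a trained classifier or a rule-based decision instead, depending on what the full paper specifies.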
