Automatic multimedia indexing: combining audio, speech, and visual information to index broadcast news

K Ohtsuki, Y Matsuo, S Matsunaga, K Bessho, Y Hayashi

doi:10.1109/msp.2006.1621450

Abstract

This paper describes an indexing system that automatically creates metadata for multimedia broadcast news content by integrating audio, speech, and visual information. The system includes acoustic segmentation (AS), automatic speech recognition (ASR), topic segmentation (TS), and video indexing functions. In the AS module, new spectral-based features and a smoothing method improved speech detection in the audio stream of the input news content. In the ASR module, automatic selection of acoustic models achieved both a word error rate (WER) as low as that of parallel recognition with multiple acoustic models and a recognition speed comparable to that of a single acoustic model. The TS method based on word concept vectors produced more accurate results than the conventional method based on local word frequency vectors. The information integration module combines the results of the AS, TS, and SC modules. Story boundary detection was more accurate when the TS results were combined with the AS and SC results than when the TS results were used alone.
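
The abstract does not give the algorithms, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the conventional baseline works roughly like a sliding-window comparison of local word frequency vectors (the word concept vectors are not reproduced here), and it assumes a simple fusion rule in which a TS candidate is kept only if an AS or SC boundary lies nearby. The function names, the window size, the similarity threshold, and the time tolerance are all hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def ts_candidates(sentences, window=2, threshold=0.1):
    """Return gap indices i (a boundary just before sentence i) where the
    word-frequency vectors of the adjacent windows are dissimilar.
    Assumption: a fixed threshold stands in for whatever criterion is really used."""
    candidates = []
    for i in range(1, len(sentences)):
        left = Counter(w for s in sentences[max(0, i - window):i] for w in s)
        right = Counter(w for s in sentences[i:i + window] for w in s)
        if cosine(left, right) < threshold:
            candidates.append(i)
    return candidates


def fuse(ts_gaps, gap_times, as_times, sc_times, tolerance=1.0):
    """Hypothetical fusion rule: keep a TS candidate only if some AS or SC
    boundary time lies within `tolerance` seconds of it."""
    support = list(as_times) + list(sc_times)
    return [g for g in ts_gaps
            if any(abs(gap_times[g] - t) <= tolerance for t in support)]


# Toy transcript: two sentences about an election, then two about a storm.
sentences = [["election", "vote"], ["vote", "party"],
             ["storm", "rain"], ["rain", "flood"]]
gaps = ts_candidates(sentences)
stories = fuse(gaps, gap_times={2: 30.0}, as_times=[29.6], sc_times=[])
```

In this toy input the only gap with no shared vocabulary across its windows is the one before the third sentence, and it survives fusion because an AS boundary falls 0.4 s away. A real system would tune the thresholds on held-out data and would likely weight the three sources rather than apply a hard filter.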

Journal: IEEE Signal Processing Magazine
Publication Date: Mar 1, 2006
Citations: 15

Similar Papers

Automatic indexing of multimedia content by integration of audio, spoken language, and visual information
K Ohtsuki ... Y Matsuo
30 Nov 2003

A Highly Adaptive Acoustic Model for Accurate Multi-dialect Speech Recognition
Sanghyun Yoo ... Inchul Song
01 May 2019

Speech recognition system robust to noise and speaking styles
Shigeki Matsuda ... Satoshi Nakamura
04 Oct 2004

Advances in automatic transcription of Italian broadcast news
Fabio Brugnara ... Mauro Cettolo
16 Oct 2000
