Information extraction from broadcast news

Yoshihiko Gotoh, Steve Renals

doi:10.1098/rsta.2000.0587

Open Access

Journal: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Publication Date: Apr 15, 2000
Citations: 24

Affiliation: University of Sheffield

#Radio News Broadcasts #Sparse Training Data

Abstract

This paper discusses the development of trainable statistical models for extracting content from television and radio news broadcasts. In particular, we concentrate on statistical finite-state models for identifying proper names and other named entities in broadcast speech. Two models are presented: the first represents name class information as a word attribute; the second represents both word-word and class-class transitions explicitly. A common n-gram-based formulation is used for both models. The task of named entity identification is characterized by relatively sparse training data, and issues related to smoothing are discussed. Experiments are reported using the DARPA/NIST Hub-4E evaluation for North American Broadcast News.
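
As a rough illustration of the idea of treating the name class as a word attribute, the following minimal sketch trains a bigram model over (word, class) tokens and decodes the most likely class sequence with a Viterbi search. It is not the authors' formulation: the toy corpus, the class labels, the function names (`train`, `logp`, `tag`) and the add-k smoothing constant are all assumptions made for this example, and add-k stands in for whatever smoothing the paper actually discusses.

```python
import math
from collections import Counter

# Toy tagged corpus, invented for illustration: (word, class) pairs.
TRAIN = [
    [("mr", "O"), ("smith", "PERSON"), ("visited", "O"), ("london", "LOCATION")],
    [("mr", "O"), ("jones", "PERSON"), ("visited", "O"), ("paris", "LOCATION")],
    [("smith", "PERSON"), ("left", "O"), ("paris", "LOCATION")],
]
START = ("<s>", "O")  # hypothetical sentence-start token


def train(corpus):
    """Count histories and bigrams over (word, class) tokens."""
    hist, bi = Counter(), Counter()
    for sent in corpus:
        prev = START
        for tok in sent:
            hist[prev] += 1
            bi[(prev, tok)] += 1
            prev = tok
    classes = sorted({c for sent in corpus for _, c in sent})
    n_types = len({tok for sent in corpus for tok in sent}) + 1  # +1 for unseen
    return hist, bi, classes, n_types


def logp(prev, tok, hist, bi, n_types, k=0.1):
    """Add-k smoothed log P(tok | prev); a crude stand-in for real smoothing."""
    return math.log((bi[(prev, tok)] + k) / (hist[prev] + k * n_types))


def tag(words, hist, bi, classes, n_types):
    """Viterbi search for the best class sequence under the bigram model."""
    # best[c] = (log probability, class path) of the best sequence ending in c
    best = {START[1]: (0.0, [])}
    prev_word = START[0]
    for w in words:
        new = {}
        for c in classes:
            new[c] = max(
                (score + logp((prev_word, pc), (w, c), hist, bi, n_types), path + [c])
                for pc, (score, path) in best.items()
            )
        best, prev_word = new, w
    return max(best.values())[1]


HIST, BI, CLASSES, N_TYPES = train(TRAIN)
print(tag(["mr", "smith", "visited", "paris"], HIST, BI, CLASSES, N_TYPES))
```

With so little training data most bigrams are unseen, so the output depends heavily on the smoothing constant `k`; this is a small-scale view of the sparse-data issue the abstract mentions. A more realistic system would back off to lower-order estimates rather than rely on add-k alone.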
