Robust information extraction from automatically generated speech transcriptions

David D Palmer, Mari Ostendorf, John D Burger

doi:10.1016/s0167-6393(00)00026-1


Journal: Speech Communication
Publication Date: Aug 14, 2000
Citations: 36

Affiliation: University of Washington, Mitre (United States)

#Spoken Language Data #Information Extraction


Abstract

This paper describes a robust system for information extraction (IE) from spoken language data. The system extends previous hidden Markov model (HMM) work in IE, using a state topology designed for explicit modeling of variable-length phrases and class-based statistical language model smoothing to produce state-of-the-art performance across a wide range of speech recognition error rates. Experiments on broadcast news data show that the system performs well despite temporal and source differences in the data. In addition, the paper introduces strategies for integrating word-level confidence estimates into the model. Performance improves when a generic error token replaces incorrectly recognized words in the training data and low-confidence words in the test data.
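The error-token idea in the last sentence of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the token string `<ERR>`, the function names, the 0.5 threshold, and the per-word correctness flags (assumed to come from aligning recognizer output with a reference transcript) are all hypothetical choices made for illustration.

```python
from typing import List, Sequence

# Hypothetical symbol; the abstract does not say how the generic error token is written.
ERROR_TOKEN = "<ERR>"


def mask_training_errors(
    hyp_words: Sequence[str],
    is_correct: Sequence[bool],
    error_token: str = ERROR_TOKEN,
) -> List[str]:
    """Training side: replace words flagged as misrecognized with the error token.

    `is_correct` is assumed to come from aligning the recognizer output
    against a reference transcript (the alignment itself is not shown here).
    """
    if len(hyp_words) != len(is_correct):
        raise ValueError("hyp_words and is_correct must have the same length")
    return [w if ok else error_token for w, ok in zip(hyp_words, is_correct)]


def mask_low_confidence(
    hyp_words: Sequence[str],
    confidences: Sequence[float],
    threshold: float = 0.5,  # assumed value, chosen only for this sketch
    error_token: str = ERROR_TOKEN,
) -> List[str]:
    """Test side: replace words whose confidence is below the threshold."""
    if len(hyp_words) != len(confidences):
        raise ValueError("hyp_words and confidences must have the same length")
    return [w if c >= threshold else error_token
            for w, c in zip(hyp_words, confidences)]


# Toy example: the last word is both misrecognized and low-confidence.
words = ["president", "clinton", "met", "with", "yeltsin"]
train_view = mask_training_errors(words, [True, True, True, True, False])
test_view = mask_low_confidence(words, [0.9, 0.8, 0.95, 0.9, 0.3])
```

In this toy case both views end in `<ERR>`, so a model trained on the masked data sees the same symbol at test time wherever the recognizer is unsure, instead of a wrong word.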
