Abstract

Most previous studies on speech summarization focus on extractive approaches. Yet directly concatenating the extracted utterances may not form a good summary, because unplanned spontaneous speech contains disfluencies and redundancy. In this paper, we propose generating compressed speech summaries by coupling sentence-level compression with summarization, as a viable step towards abstractive summaries. We compare two utterance compression approaches: an unsupervised approach based on the Integer Linear Programming (ILP) framework, and a supervised method using conditional random fields (CRFs) that formulates utterance compression as a sequence labeling task. We evaluate compression performance on both human and ASR transcripts from the ICSI meeting corpus, using both automatic and human evaluation. Our results show that reasonable utterance compression performance is achievable, and that the CRF-based method generally performs better. By coupling the compression and summarization approaches, we generate compressed speech summaries that cover more important information within the given length limit, yielding a 5% absolute gain in ROUGE-1 F-score on both human and ASR transcripts.
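As a rough illustration of two ideas mentioned in the abstract, the sketch below treats utterance compression as per-word labeling and scores the result with a simplified ROUGE-1 F-score. It is not the paper's implementation: the label names `K` (keep) and `D` (delete), the example utterance, and the hand-assigned labels are all assumptions made for illustration, where a trained CRF would predict the labels in practice. The scoring function uses plain clipped unigram overlap and omits the stemming and other options of the official ROUGE toolkit.

```python
from collections import Counter


def apply_labels(words, labels):
    """Return the words labeled 'K' (keep), dropping those labeled 'D' (delete).

    The K/D label scheme is a hypothetical choice for this sketch.
    """
    if len(words) != len(labels):
        raise ValueError("words and labels must have the same length")
    return [word for word, label in zip(words, labels) if label == "K"]


def rouge1_f(candidate, reference):
    """Simplified ROUGE-1 F-score: harmonic mean of unigram precision and recall.

    Overlap counts are clipped, so a repeated candidate word is credited at most
    as many times as it occurs in the reference.
    """
    cand_counts = Counter(candidate)
    ref_counts = Counter(reference)
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Invented spontaneous-speech utterance with fillers and a repetition.
utterance = "um so i i think we should uh record the meeting".split()
# Hand-assigned labels standing in for the output of a sequence labeler.
labels = ["D", "D", "D", "K", "K", "K", "K", "D", "K", "K", "K"]

compressed = apply_labels(utterance, labels)
reference = "i think we should record the meeting".split()

print(" ".join(compressed))
print(rouge1_f(utterance, reference), rouge1_f(compressed, reference))
```

In this toy case the compressed utterance matches the reference more closely than the original does, because removing fillers and repetitions raises unigram precision without lowering recall.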
