Abstract

Topic segmentation is a necessary prerequisite for multimedia content analysis and management. The normalized cuts (NCuts) approach has shown superior performance in topic segmentation of spoken documents. However, this method requires the number of topics in a document to be known prior to segmentation, which is impractical for real-world applications given the exponential growth of multimedia data. Moreover, previous lexical-based spoken document segmentation approaches, including NCuts, operate on text transcripts produced by a large vocabulary continuous speech recognizer (LVCSR). Training such a recognizer requires a large amount of transcribed speech data and language-specific knowledge, and inevitable speech recognition errors together with the out-of-vocabulary (OOV) problem degrade segmentation performance. This paper addresses these problems with a self-validated acoustic normalized cuts approach, termed SACuts. First, in contrast to NCuts, our approach determines the number of topics in a spoken document automatically without extra computational cost. Second, in contrast to lexical approaches that rely on a resource-intensive speech recognizer, our approach achieves comparable or even better segmentation performance using only acoustic-level information. Evaluation on a broadcast news topic segmentation task demonstrates the superiority of the proposed approach.
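The abstract does not describe the SACuts algorithm in detail, so the following is only a minimal illustrative sketch of the general idea it builds on: normalized-cuts-style spectral clustering over acoustic feature vectors, with the number of topics chosen automatically via an eigengap heuristic rather than fixed in advance. The function names, the Gaussian similarity kernel, and the eigengap rule are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only (NOT the paper's SACuts algorithm): spectral clustering
# in the normalized-cuts style, where the number of clusters k is selected
# automatically with the eigengap heuristic instead of being given in advance.
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def segment_by_ncuts(features, sigma=1.0, max_k=10):
    """Cluster acoustic feature vectors (one per utterance/block).

    features : (n, d) array of per-block feature vectors (assumed input).
    Returns (labels, k): a cluster label per block and the estimated topic count.
    """
    n = features.shape[0]
    # Gaussian similarity matrix between blocks.
    dists = np.sum((features[:, None, :] - features[None, :, :]) ** 2, axis=-1)
    W = np.exp(-dists / (2.0 * sigma ** 2))
    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(n) - D_inv_sqrt @ W @ D_inv_sqrt
    # Eigenvalues in ascending order; eigenvectors as columns.
    vals, vecs = eigh(L)
    # Eigengap heuristic: pick k at the largest gap among the leading eigenvalues.
    gaps = np.diff(vals[: max_k + 1])
    k = int(np.argmax(gaps)) + 1
    # Embed into the first k eigenvectors and cluster (standard spectral clustering).
    emb = vecs[:, :k]
    emb = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(emb)
    return labels, k

# Toy usage with synthetic "acoustic" features drawn from three groups:
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = np.vstack([rng.normal(loc=c, size=(20, 12)) for c in (0.0, 3.0, 6.0)])
    labels, k = segment_by_ncuts(feats)
    print("estimated number of topics:", k)
```

The eigengap heuristic is one common way to make spectral clustering self-validated with negligible added cost, since the eigenvalues are already computed for the embedding; whether SACuts uses this particular criterion is not stated in the abstract.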
