Automatic discovery of topics and acoustic morphemes from speech

Christophe Cerisara

doi:10.1016/j.csl.2008.06.004

Abstract

This work deals with automatic lexical acquisition and topic discovery from a speech stream. The proposed algorithm builds a lexicon enriched with topic information in three steps: transcription of an audio stream into phone sequences with a speaker- and task-independent phone recogniser, automatic lexical acquisition based on approximate string matching, and hierarchical topic clustering of the lexical entries based on a knowledge-poor co-occurrence approach. The resulting semantic lexicon is then used to automatically cluster the incoming speech stream into topics. The main advantages of this algorithm are its very low computational requirements and its independence from pre-defined linguistic resources, which makes it easy to port to new languages and to adapt to new tasks. It is evaluated both qualitatively and quantitatively on two corpora and on two tasks related to topic clustering. The results of these evaluations are encouraging and outline future directions of research for the proposed algorithm, such as automatically building orthographic labels for the lexical items.
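To make the second step more concrete, the following is a minimal sketch of approximate string matching over phone sequences. The abstract does not specify the matching procedure used in the paper, so this example assumes a plain Levenshtein distance with unit costs and a fixed-length sliding window; the function names, the phone symbols, and the `max_dist` threshold are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences (unit costs).

    Assumption: the paper's actual distance measure is not given in the
    abstract; unit-cost edit distance is used here as a stand-in.
    """
    prev = list(range(len(b) + 1))
    for i, phone_a in enumerate(a, start=1):
        curr = [i]
        for j, phone_b in enumerate(b, start=1):
            cost = 0 if phone_a == phone_b else 1
            curr.append(min(
                prev[j] + 1,         # delete phone_a
                curr[j - 1] + 1,     # insert phone_b
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def find_approximate_matches(
    stream: Sequence[str], pattern: Sequence[str], max_dist: int
) -> List[int]:
    """Start indices of windows of len(pattern) phones that lie within
    max_dist edits of the pattern (hypothetical helper)."""
    n = len(pattern)
    return [
        start
        for start in range(len(stream) - n + 1)
        if edit_distance(stream[start:start + n], pattern) <= max_dist
    ]


# Illustrative phone stream (made-up symbols, not data from the paper).
stream = "dh ax k ae t s ae t aa n dh ax m ae t".split()
pattern = "k ae t".split()
matches = find_approximate_matches(stream, pattern, max_dist=1)
```

Tolerating a small number of edits is what lets recurring phone patterns be grouped despite recogniser errors; in this sketch, windows that differ from the pattern by a single phone are still reported as matches.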


Journal: Computer Speech & Language
Publication Date: Jul 8, 2008
Citations: 8

Similar Papers

Average-Case Optimal Approximate Circular String Matching
Carl Barton ... Costas S Iliopoulos
01 Jan 2015

Business Process Automation: A Workflow Incorporating Optical Character Recognition and Approximate String and Pattern Matching for Solving Practical Industry Problems
Coenrad De Jager ... Marinda Nel
Applied System Innovation | VOL. 2
24 Oct 2019

LibFLASM: a software library for fixed-length approximate string matching.
Lorraine A K Ayad ... Ahmad Retha
BMC Bioinformatics | VOL. 17
10 Nov 2016

Approximate String Matching Algorithms: A Brief Survey and Comparison
Syeda Shabnamhasan ... Rosina Surovi Khan
International Journal of Computer Applications | VOL. 120
18 Jun 2015
