Unsupervised optimal phoneme segmentation: theory and experimental evaluation

Yu Qiao, Nobuaki Minematsu, Dean Luo

doi:10.1049/iet-spr.2012.0191

Abstract

Automatic phoneme segmentation of a speech sequence is a basic problem in speech engineering. This study investigates unsupervised phoneme segmentation that uses no prior information on the linguistic content or acoustic models of the input sequence. The authors formulate unsupervised segmentation as an optimisation problem by means of maximum likelihood, and show that the optimal segmentation corresponds to minimising the coding length of the input sequence. Under different assumptions, five objective functions are developed: log determinant, rate distortion (RD), Bayesian log determinant, Mahalanobis distance and Euclidean distance. The authors prove that the optimal segmentations have transformation-invariant properties, introduce a time-constrained agglomerative clustering algorithm to find the optimal segmentations, and propose an efficient implementation of the algorithm using integration functions. Experiments on the TIMIT database compare the five objective functions. The results show that RD achieves the best performance, and that the proposed method outperforms previous unsupervised segmentation methods.
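
The time-constrained agglomerative clustering mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the objectives, so the sketch assumes a Ward-style increase in within-segment squared Euclidean distance as a stand-in for the Euclidean distance objective, and it omits the integration-function speed-up. The names segment and merge_cost are hypothetical. The time constraint is the restriction that only temporally adjacent segments may be merged.

```python
from typing import List, Sequence


def merge_cost(n_a: int, mean_a: Sequence[float],
               n_b: int, mean_b: Sequence[float]) -> float:
    """Increase in within-segment squared error if two segments are merged.

    Assumption: a Ward-style criterion stands in for the paper's
    Euclidean distance objective, which the abstract does not spell out.
    """
    sq_dist = sum((x - y) ** 2 for x, y in zip(mean_a, mean_b))
    return n_a * n_b / (n_a + n_b) * sq_dist


def segment(frames: Sequence[Sequence[float]], n_segments: int) -> List[int]:
    """Greedy bottom-up segmentation that only merges adjacent segments.

    Returns the start index of every segment except the first,
    i.e. the estimated boundary positions.
    """
    if not 1 <= n_segments <= len(frames):
        raise ValueError("n_segments must be between 1 and len(frames)")

    # Each segment is (start index, frame count, mean feature vector).
    segs = [(i, 1, [float(v) for v in f]) for i, f in enumerate(frames)]

    while len(segs) > n_segments:
        # Time constraint: only neighbouring segments are merge candidates.
        costs = [
            merge_cost(segs[i][1], segs[i][2], segs[i + 1][1], segs[i + 1][2])
            for i in range(len(segs) - 1)
        ]
        i = min(range(len(costs)), key=costs.__getitem__)
        (start, n_a, mean_a), (_, n_b, mean_b) = segs[i], segs[i + 1]
        n = n_a + n_b
        mean = [(n_a * x + n_b * y) / n for x, y in zip(mean_a, mean_b)]
        segs[i:i + 2] = [(start, n, mean)]

    return [start for start, _, _ in segs[1:]]


# Toy one-dimensional "features" with three clearly separated plateaus.
frames = [[0.0], [0.1], [0.0], [5.0], [5.1], [5.0], [10.0], [10.1]]
print(segment(frames, 3))  # boundaries at frames 3 and 6
```

In this sketch the number of segments is supplied by the caller and every pass recomputes all adjacent merge costs. A practical system would need its own stopping rule and a faster cost update.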

Journal: IET Signal Processing
Publication Date: Sep 1, 2013
Citations: 3

Similar Papers

Unsupervised optimal phoneme segmentation: Objectives, algorithm and comparisons
Yu Qiao ... Naoya Shimomura
01 Mar 2008

Comparing supervised and unsupervised multiresolution segmentation approaches for extracting buildings from very high resolution imagery
Mariana Belgiu ... Lucian Drǎguţ
ISPRS Journal of Photogrammetry and Remote Sensing | VOL. 96
28 Jul 2014

Say Cheese vs. Smile
Yelin Kim ... Emily Mower Provost
03 Nov 2014

Unsupervised Multiple Object Segmentation of Multiview Images
Wenxian Yang ... King Ngi Ngan
28 Aug 2007
