On temporal alignment of sentences of natural and synthetic speech

H Hohne, C Coker, L Rabiner, S Levinson

doi:10.1109/tassp.1983.1164174

Abstract

One way to improve the quality of synthetic speech, and to learn about temporal aspects of speech recognition, is to study the problem of time-aligning pairs of spoken sentences. For example, one could evaluate various sets of duration rules for synthesis by comparing the time alignments of speech sounds within synthetic sentences to those of naturally spoken sentences. In this manner, an improved set of sound duration rules could be obtained by applying some objective measure to the alignment scores. For speech recognition applications, one could automatically label continuous speech from a hand-marked prototype to obtain models and/or statistical data on sounds within sentences. A key question in the use of automatic alignment of sentence-length utterances is whether the time warping methods developed for isolated-word recognition can be extended to utterances up to several seconds long. A second key question is the reliability and accuracy of such an alignment. In this paper we investigate these questions. It is shown that, with some simple modifications, the dynamic time warping procedures used for isolated-word recognition apply almost as well to the alignment of sentence-length utterances. It is also shown that, on average, the uncertainty in the location of significant events within the sentence is much smaller than the event durations, although the largest errors are longer than some event durations. Hence, one must apply caution in using the time alignment contour for synthesis or recognition applications.
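
To make the idea of a time alignment contour concrete, the following is a minimal sketch of textbook dynamic time warping, not the procedure used in the paper (the abstract does not describe its modifications). It assumes each utterance is already reduced to a sequence of per-frame feature values; the scalar features, the absolute-difference distance, the unconstrained local path, and the names `dtw_align` and `transfer_labels` are all illustrative assumptions.

```python
from math import inf


def dtw_align(ref, test, dist=lambda a, b: abs(a - b)):
    """Align two frame sequences with basic DTW.

    Returns (total_cost, path), where path is a list of (ref_index, test_index)
    pairs running from the first frames to the last frames of both sequences.
    """
    n, m = len(ref), len(test)
    # cost[i][j]: minimum accumulated distance aligning ref[:i+1] with test[:j+1]
    cost = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]

    for i in range(n):
        for j in range(m):
            d = dist(ref[i], test[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            # Allowed predecessors: diagonal, vertical, and horizontal steps.
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))
            best, prev = min(candidates)
            cost[i][j] = d + best
            back[i][j] = prev

    # Trace the optimal path back from the final frame pair.
    path = []
    node = (n - 1, m - 1)
    while node is not None:
        path.append(node)
        node = back[node[0]][node[1]]
    path.reverse()
    return cost[n - 1][m - 1], path


def transfer_labels(path, ref_marks):
    """Map hand-marked reference frame indices onto the test utterance.

    Each mark is mapped to the first test frame aligned with it.
    """
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return {mark: first_match[mark] for mark in ref_marks}
```

In this sketch, `path` plays the role of the alignment contour, and `transfer_labels` illustrates how boundaries marked by hand on one utterance could be carried over to another. A practical system would use spectral feature vectors per frame and would typically constrain the slope and extent of the path.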

Journal: IEEE Transactions on Acoustics, Speech, and Signal Processing
Publication Date: Aug 1, 1983
Citations: 20
