Abstract
In this correspondence, we present preliminary results on using phonetic subword units, rather than whole-word templates, in word recognition. The subword units are specified either as phonelike units, with or without temporal structure, or as diphonelike units. Determining these units requires simultaneous segmentation, labeling, and parameter estimation, which is performed by an iterative two-stage algorithm that alternates nonlinear time alignment with parameter estimation. Experiments on a connected digit recognition task were carried out to study the usefulness of the subword unit representation and the effect of several versions of the subword specification on recognition performance. The best error rates for subword units are still larger, by a factor of 2 or more, than those for whole-word templates.
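The two-stage iteration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each unit is modeled by a single mean vector, that the unit sequence of each utterance is known, and that a squared Euclidean distance is used. The names `align`, `reestimate`, and `train` are hypothetical. Stage one aligns frames to units by dynamic programming; stage two re-estimates each unit's mean from the frames assigned to it.

```python
from typing import Dict, List, Sequence, Tuple

Frame = Sequence[float]
Utterance = Tuple[List[Frame], List[str]]  # (feature frames, known unit sequence)


def sq_dist(a: Frame, b: Frame) -> float:
    """Squared Euclidean distance (an assumed local distance measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def align(frames: List[Frame], units: List[str], means: Dict[str, Frame]) -> List[int]:
    """Stage 1: nonlinear time alignment by dynamic programming.

    Returns, for each frame, the index of the unit it is assigned to.
    The path is monotonic and every unit receives at least one frame.
    """
    T, N = len(frames), len(units)
    if T < N:
        raise ValueError("need at least one frame per unit")
    inf = float("inf")
    cost = [[inf] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    cost[0][0] = sq_dist(frames[0], means[units[0]])
    for t in range(1, T):
        for n in range(N):
            stay = cost[t - 1][n]                       # remain in the same unit
            move = cost[t - 1][n - 1] if n > 0 else inf  # advance to the next unit
            best, prev = (stay, n) if stay <= move else (move, n - 1)
            if best < inf:
                cost[t][n] = best + sq_dist(frames[t], means[units[n]])
                back[t][n] = prev
    # Backtrace from the last frame, which must belong to the last unit.
    path = [N - 1]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def reestimate(utterances: List[Utterance], paths: List[List[int]],
               means: Dict[str, Frame]) -> Dict[str, Frame]:
    """Stage 2: set each unit's mean to the average of its assigned frames."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for (frames, units), path in zip(utterances, paths):
        for frame, n in zip(frames, path):
            u = units[n]
            if u not in sums:
                sums[u] = [0.0] * len(frame)
                counts[u] = 0
            sums[u] = [s + x for s, x in zip(sums[u], frame)]
            counts[u] += 1
    new_means = dict(means)  # units with no assigned frames keep their old mean
    for u, s in sums.items():
        new_means[u] = [v / counts[u] for v in s]
    return new_means


def train(utterances: List[Utterance], means: Dict[str, Frame],
          iterations: int = 5) -> Tuple[Dict[str, Frame], List[List[int]]]:
    """Alternate alignment and re-estimation for a fixed number of iterations."""
    for _ in range(iterations):
        paths = [align(f, u, means) for f, u in utterances]
        means = reestimate(utterances, paths, means)
    return means, [align(f, u, means) for f, u in utterances]


if __name__ == "__main__":
    # Toy one-dimensional data: three frames near 0, then two frames near 5.
    utt = ([[0.0], [0.1], [0.2], [5.0], [5.1]], ["a", "b"])
    final_means, final_paths = train([utt], {"a": [1.0], "b": [4.0]}, iterations=3)
    print(final_paths, final_means)
```

In this toy setting the segment boundary and the unit means settle after the first iteration. A practical system would add a convergence test and richer unit models, for example several states per unit to capture temporal structure.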
More From: IEEE Transactions on Acoustics, Speech, and Signal Processing