Abstract
In this paper, we study the performance of a baseline hidden Markov model (HMM) for the segmentation of speech signals. The model is applied to a single-speaker segmentation task using a Hindi speech database. The automatic phoneme segmentation framework developed here imitates the human phoneme segmentation process. A set of 44 Hindi phonemes was chosen for the segmentation experiment, in which we used a continuous density hidden Markov model (CDHMM) with Gaussian mixture distributions. A left-to-right topology with no skip states was selected because it is effective in speech recognition, being consistent with the natural way spoken words are articulated. The system accepts speech utterances along with their orthographic transcriptions and generates segmentation information for the speech. The Hindi speech corpus was used to develop a context-independent HMM for each of the Hindi phonemes. The system was trained on numerous sentences relevant to providing information to Metro Rail passengers, and it was validated against a few manually segmented speech utterances. The evaluation shows that the best performance is obtained with a combination of two Gaussian mixtures and five HMM states. A category-wise phoneme error analysis has been performed, and the performance of the phonetic segmentation is reported. The HMM modeling was implemented in C++ using Microsoft Visual Studio 2005, and the system is designed to run on the Windows operating system. The goal of this study is the automatic segmentation of speech at the phonetic level.
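To make the left-to-right, no-skip topology concrete, the following is a minimal illustrative sketch in Python of the transition structure such a model implies. It is not the authors' C++ implementation. The function name `left_to_right_transitions`, the `self_loop` value of 0.5, and the treatment of the last state as absorbing are all assumptions made for illustration; in a trained system these probabilities would be estimated from data.

```python
def left_to_right_transitions(n_states, self_loop=0.5):
    """Build a transition matrix for a left-to-right HMM with no skip states.

    Each state may only stay where it is or move to the next state.
    The self_loop probability is a placeholder (assumption), not a trained value.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 <= self_loop < 1.0:
        raise ValueError("self_loop must be in [0, 1)")

    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i == n_states - 1:
            # Simplification (assumption): the final state is absorbing.
            A[i][i] = 1.0
        else:
            A[i][i] = self_loop            # stay in the current state
            A[i][i + 1] = 1.0 - self_loop  # advance to the next state only
    return A


if __name__ == "__main__":
    # Five states, matching the best configuration reported in the abstract.
    for row in left_to_right_transitions(5):
        print(row)
```

Every entry more than one step to the right of the diagonal is zero, which is what "no skip states" means: a phoneme model has to pass through each of its states in order.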