Abstract

In this paper, a phonemic segmentation system is constructed that is useful in practice as a preprocessing stage for English continuous speech recognition, and the usefulness of the system is demonstrated. English continuous speech contains many weakly voiced, ambiguously articulated vowels called schwa, which makes stable detection of phoneme boundaries difficult. In fact, there has been almost no detailed proposal for a segmentation system that can be used in practice as preprocessing for recognition.

In the proposed method, the mel cepstrum is extracted from the speech signal using the unbiased estimation of the log spectrum. This process is known to be stable, i.e., less affected by the fine spectral structure. Then the dynamic segmentation parameters are determined from the mel cepstrum such that the boundaries of the schwa phonemes can be detected using a pseudo-differentiation filter. By combining the dynamic parameters with the static parameters, phoneme-wise segmentation is executed hierarchically.

The proposed system can segment English continuous speech containing diverse phonemic environments into phonemes on the time axis. The system operates based only on acoustic knowledge of phonemes that is common to speakers with different voice qualities and utterances, without using complex speaker-dependent boundary detection rules. The proposed system is also evaluated experimentally using English continuous phoneme-balanced speech uttered for 350 s by one male and one female native English speaker. For a total of 3024 phonemes, the phoneme boundary detection ratio is 97.1 percent, the boundary deletion ratio is 2.9 percent, and the boundary addition ratio is 24.2 percent.
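
As a rough illustration of the dynamic-parameter idea, the sketch below computes a smoothed time derivative of a cepstral sequence and marks local peaks of its magnitude as boundary candidates. It is not the filter used in the paper: a standard regression-slope (delta) filter stands in for the pseudo-differentiation filter, and the function names, the window half-width, and the threshold are all hypothetical choices made for this example.

```python
import numpy as np

def delta_features(cep, half_width=2):
    """Smoothed time derivative of each coefficient (regression slope).

    cep: array of shape (num_frames, num_coeffs).
    Assumption: a regression-slope filter stands in for the paper's
    pseudo-differentiation filter, whose exact form is not given here.
    """
    cep = np.asarray(cep, dtype=float)
    offsets = np.arange(-half_width, half_width + 1)
    denom = float(np.sum(offsets ** 2))
    # Repeat the edge frames so the first and last frames have a full window.
    padded = np.pad(cep, ((half_width, half_width), (0, 0)), mode="edge")
    delta = np.zeros_like(cep)
    for k in offsets:
        start = half_width + int(k)
        delta += k * padded[start:start + len(cep)]
    return delta / denom

def boundary_candidates(cep, half_width=2, threshold=0.5):
    """Return (frame indices of score peaks above threshold, score per frame).

    The threshold is a hypothetical tuning value, not taken from the paper.
    """
    score = np.linalg.norm(delta_features(cep, half_width), axis=1)
    peaks = [
        t for t in range(1, len(score) - 1)
        if score[t] >= threshold
        and score[t] > score[t - 1]
        and score[t] >= score[t + 1]
    ]
    return peaks, score

# Synthetic input: 3 coefficients that step from 0 to 1 between frames 9 and 10.
cep = np.vstack([np.zeros((10, 3)), np.ones((10, 3))])
peaks, score = boundary_candidates(cep)
print(peaks)
```

In a hierarchical scheme such as the one the abstract describes, candidates like these would be combined with static parameters before a boundary is accepted; that combination step is not sketched here.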
