Abstract
An automatic speech segmentation technique is presented that is based on the alignment of a target speech signal with a set of different reference speech signals. The reference signals are generated by a specifically designed corpus-based speech synthesis system that also generates phoneme boundary markers. Each reference signal is then warped to the target speech signal. By synthesizing and warping many different reference speech signals, each phoneme boundary of the target signal is characterized by a distribution of warped phoneme boundary positions. The boundary distributions are statistically and acoustically processed in order to generate the final segmentation. First, some problems related to manual and automatic phoneme segmentation are addressed. Then the technique of Statistical Corpus-based Segmentation (SCS) is introduced. Finally, intra- and inter-speaker segmentation results are presented.
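The alignment-and-pooling idea described above can be illustrated with a minimal sketch. It is not the SCS implementation: it assumes toy one-dimensional frame features in place of real acoustic features, plain dynamic time warping as the warping method, and a per-boundary median as the statistical step. None of these choices is specified by the abstract, and the function names (`dtw_path`, `warp_boundaries`, `combine_boundaries`) are hypothetical.

```python
from statistics import median


def dtw_path(ref, tgt):
    """Return a DTW alignment path [(ref_frame, tgt_frame), ...].

    Assumption: each frame is a single float; real systems would use
    multi-dimensional acoustic feature vectors.
    """
    n, m = len(ref), len(tgt)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning ref[:i] with tgt[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - tgt[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the last frame pair to recover the path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def warp_boundaries(path, ref_boundaries):
    """Map reference boundary frames to the first target frame aligned with each."""
    first_match = {}
    for i, j in path:
        first_match.setdefault(i, j)
    return [first_match[b] for b in ref_boundaries]


def combine_boundaries(warped_sets):
    """Pool warped boundaries per phoneme boundary (assumption: median)."""
    return [median(column) for column in zip(*warped_sets)]


# Toy usage: three references, each with two boundary markers.
target = [0, 0, 0, 1, 1, 1, 1, 0, 0]
references = [
    ([0, 0, 1, 1, 0], [2, 4]),
    ([0, 1, 1, 1, 0, 0, 0], [1, 4]),
    ([0, 0, 0, 0, 1, 0], [4, 5]),
]
warped = [warp_boundaries(dtw_path(ref, target), marks)
          for ref, marks in references]
print(combine_boundaries(warped))
```

Each row of `warped` is one reference's estimate of the target boundaries, so each column is a small boundary distribution of the kind the abstract describes. The median is used here only because it is robust to a single badly warped reference.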