Abstract

A multi‐level segmentation of a one‐dimensional signal may be induced by hierarchical ordering of subsets in a corresponding parameter space. This concept has been introduced to design a segmentation algorithm that creates a special three‐level segmentation for the speech signal. For preclassification it uses two parameters, the short‐time prediction gain and the short‐time variance, to form the second level of a segment hierarchy containing the classes “pause,” “fricative,” “vocal,” and “nasal oriented.” Merging the segment classes “pause” and “fricative,” as well as “vocal” and “nasal oriented,” forms the first level. Since the vocal parts comprise more than 50% of speech, a clustering procedure has been added to create a third level containing four classes that roughly correspond to four different vowel classes. The clustering algorithm uses the sampled LPC‐generated log power spectrum as its parameter vector and the L2 distance as its metric. Five samples of speech, each 1 min long, have been processed. The resulting segmentation served as a basis for a number of segment length statistics which suggest applications in speaker verification and speech coding. [Work supported by VW Foundation, at the University of Hannover, Germany.]
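The first two levels of such a hierarchy can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the two parameters map to the four classes, so the decision rule, the thresholds (`var_pause`, `var_vocal`, `gain_db`), the predictor order, the frame length, and the level-one labels are all hypothetical choices made for illustration. The prediction gain is computed here with a standard Levinson-Durbin recursion, and the third-level clustering is omitted.

```python
import math
import random

def autocorr(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of a mean-removed frame; r[0] is the variance."""
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) / n
            for k in range(max_lag + 1)]

def prediction_gain_db(r):
    """Short-time prediction gain in dB: signal energy over LPC residual energy."""
    if r[0] <= 0.0:
        return 0.0
    err = r[0]
    a = [0.0] * len(r)
    for m in range(1, len(r)):  # Levinson-Durbin recursion
        k = (r[m] - sum(a[j] * r[m - j] for j in range(1, m))) / err
        prev = a[:]
        a[m] = k
        for j in range(1, m):
            a[j] = prev[j] - k * prev[m - j]
        err = max(err * (1.0 - k * k), 1e-12 * r[0])  # guard against log of zero
    return 10.0 * math.log10(r[0] / err)

# Level two -> level one merge, as described in the abstract.
# The level-one names are hypothetical; the abstract does not name them.
LEVEL2_TO_LEVEL1 = {
    "pause": "pause/fricative", "fricative": "pause/fricative",
    "vocal": "vocal/nasal", "nasal oriented": "vocal/nasal",
}

def classify_frame(frame, order=8, var_pause=1e-4, var_vocal=1e-2, gain_db=6.0):
    """Hypothetical decision rule over (variance, prediction gain); thresholds are assumptions."""
    r = autocorr(frame, order)
    variance = r[0]
    if variance < var_pause:
        return "pause"                      # very low energy
    if prediction_gain_db(r) < gain_db:
        return "fricative"                  # noise-like, poorly predictable
    return "vocal" if variance >= var_vocal else "nasal oriented"

def merge_runs(labels):
    """Collapse consecutive equal frame labels into (label, first_frame, end_frame) segments."""
    runs = []
    for i, lab in enumerate(labels):
        if runs and runs[-1][0] == lab:
            runs[-1] = (lab, runs[-1][1], i + 1)
        else:
            runs.append((lab, i, i + 1))
    return runs

# Synthetic test signal: silence, white noise, a loud tone, and a quiet tone.
FRAME = 160
rng = random.Random(0)

def tone(amp, n):
    return [amp * math.sin(2 * math.pi * i / 40) for i in range(n)]

signal = ([0.0] * 480
          + [rng.uniform(-0.3, 0.3) for _ in range(480)]
          + tone(0.5, 480)
          + tone(0.05, 480))

frame_labels = [classify_frame(signal[i:i + FRAME])
                for i in range(0, len(signal), FRAME)]
level2 = merge_runs(frame_labels)
level1 = merge_runs([LEVEL2_TO_LEVEL1[lab] for lab in frame_labels])
print(level2)
print(level1)
```

Because each level-one class is a union of level-two classes, every level-one segment boundary is also a level-two boundary, which is what makes the segmentation hierarchical. Segment length statistics of the kind mentioned in the abstract could be read directly from the `(label, first_frame, end_frame)` tuples.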

