Abstract

The aim of this research is to segment spontaneous speech using an unsupervised learning technique. We approach the task from a machine perception or detection point of view, and focus on revealing some of the prosodic structure of spontaneous speech. The BEA spontaneous speech database is used to develop a speech segmentation system. The spontaneous narratives are manually annotated for intonational phrases (IPs), which are further divided into phonological phrases (PPs). A word-level transcription is also provided. For the automatic detection of IPs and the PPs embedded in them, a two-step segmentation method is applied. In the first step, the IPs are detected automatically based on speech energy, spectral centroid and a double-thresholding technique. In the second step, PPs are segmented within the IPs, based on F0, energy and Kullback-Leibler divergence combined with an adaptive thresholding method. The results show that the proposed method provides a good and efficient framework for segmenting Hungarian spontaneous speech, with performance close to that obtained on read speech.
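
To make the two steps more concrete, the following is a minimal sketch of their building blocks, not the authors' implementation. It assumes that frame-level energy and spectral centroid values are already computed, and it reads "double thresholding" as one threshold per feature, with a frame counted as speech only when both are exceeded; the abstract does not specify this. The function names, the threshold parameters and `min_frames` are hypothetical, and the abstract does not say which distributions the Kullback-Leibler divergence compares, so `kl_divergence` simply takes two discrete distributions.

```python
import math


def detect_segments(energy, centroid, energy_thr, centroid_thr, min_frames=1):
    """Return (start, end) frame index pairs, end exclusive.

    Assumption: a frame is active when its energy AND its spectral
    centroid both exceed their respective thresholds.
    """
    n = min(len(energy), len(centroid))
    segments = []
    start = None
    for i in range(n):
        active = energy[i] > energy_thr and centroid[i] > centroid_thr
        if active and start is None:
            start = i  # a new candidate segment begins
        elif not active and start is not None:
            if i - start >= min_frames:  # drop runs that are too short
                segments.append((start, i))
            start = None
    # close a segment that is still open at the end of the signal
    if start is not None and n - start >= min_frames:
        segments.append((start, n))
    return segments


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in nats.

    p and q are equal-length sequences of non-negative weights; both
    are normalised to sum to 1. eps guards against division by zero.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    total = 0.0
    for a, b in zip(p, q):
        a, b = a / sum_p, b / sum_q
        if a > 0:  # terms with zero probability in p contribute nothing
            total += a * math.log(a / max(b, eps))
    return total
```

In a pipeline of this kind, `detect_segments` would yield candidate IPs, and a divergence measure such as `kl_divergence`, evaluated between neighbouring analysis windows inside each IP, could be compared against a threshold to propose PP boundaries. How the thresholds are adapted is not described in the abstract and is left out here.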
