Abstract

The EU-funded SpeechDat project was initiated to create large-scale speech databases for the development of voice-operated telecommunication services. This paper deals with the design of two such Swedish resources: 5000 speakers recorded over the fixed telephone network and 1000 speakers recorded over the mobile network. Speakers were balanced according to gender, age and dialect. We also report on experiences from speaker recruitment. A “snowball” method, in which people gave addresses to friends according to a chain letter principle, was shown to be effective. Females were, in general, more cooperative than males. However, using the Internet for recruiting favored young males. Statistics on speaker distribution are presented. Results regarding orthographic labeling of pronunciation, pronunciation errors and non-speech events are also included. The length of the longest word in a read sentence is shown to be directly correlated with mispronunciations and word repetitions.
