Abstract
Current speech research needs large databases for testing production and perception models at different linguistic levels. Administering such databases poses considerable problems, both in labelling the speech and in providing easy access to the stored material. To alleviate some of these problems, we have created a speech analysis system. Speech data are stored in sentence-sized files. These files are segmented and transcribed semi-automatically, given a phonetic transcription of the utterance that is generated by the letter-to-sound rules of our text-to-speech system. The database is intended primarily for acoustic-phonetic research rather than for, e.g., the evaluation of speech recognizers. This calls for flexible and linguistically specified retrieval patterns. Our unorthodox solution is to use the synthesis rule structure, similar to the notation used in generative phonology, for accessing the data. With a brief rule statement, speech segments meeting the specified contextual conditions can be identified. Durational data can be collected directly during the database search. Spectral analysis programs operating on a variety of spectral representations have also been created; they display the result, typically as a mean/standard deviation spectrum or as a contour histogram spectrum.
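To make the retrieval idea concrete, here is a minimal Python sketch of a context-conditioned search over labelled segments, with durations collected during the search. It is not the system's actual rule syntax or implementation: the `target / left _ right` string format, the class names `V` and `STOP`, the `Segment` type, the function names, and the toy segment data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass(frozen=True)
class Segment:
    label: str     # phone symbol from the transcription
    start_ms: int  # segment boundaries in milliseconds
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


# Hypothetical natural classes; a real system would derive these from features.
CLASSES = {
    "V": {"a", "e", "i", "o", "u"},
    "STOP": {"p", "t", "k", "b", "d", "g"},
}


def matches(symbol: str, label: str) -> bool:
    """True if symbol is a wildcard, the label itself, or a class containing it."""
    if symbol == "*":
        return True
    return label == symbol or label in CLASSES.get(symbol, ())


def parse_rule(rule: str):
    """Split 'target / left _ right' into its parts; empty contexts become '*'."""
    target, context = (part.strip() for part in rule.split("/"))
    left, right = (part.strip() or "*" for part in context.split("_"))
    return target, left, right


def search(utterances, rule: str):
    """Return every segment whose label and neighbours satisfy the rule."""
    target, left, right = parse_rule(rule)
    boundary = Segment("#", 0, 0)  # '#' marks the utterance edges
    hits = []
    for segments in utterances:
        padded = [boundary, *segments, boundary]
        for prev, cur, nxt in zip(padded, padded[1:], padded[2:]):
            if (matches(target, cur.label)
                    and matches(left, prev.label)
                    and matches(right, nxt.label)):
                hits.append(cur)
    return hits


# Toy data (invented for this example): two short labelled utterances.
utterances = [
    [Segment("t", 0, 80), Segment("a", 80, 200),
     Segment("k", 200, 280), Segment("i", 280, 380)],
    [Segment("s", 0, 100), Segment("a", 100, 240),
     Segment("t", 240, 310), Segment("u", 310, 400)],
]

# All /a/ segments that are followed by a stop, in any left context.
hits = search(utterances, "a / _ STOP")
durations = [seg.duration_ms for seg in hits]
print(len(hits), mean(durations), pstdev(durations))
```

The same function handles a two-sided context, for example `search(utterances, "V / STOP _ #")` for utterance-final vowels preceded by a stop. Padding each utterance with a boundary symbol keeps the edge cases out of the matching loop.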