Abstract
This paper describes the design and collection of a speech corpus of elemental speech units for AlpSynth, a corpus-driven Slovenian TTS system. We describe the design procedure for the new speech corpus: purpose definition, content selection, definition of recording conditions and requirements, and corpus segmentation and annotation. First, we describe and comment on the results of a frequency analysis of Slovenian allophone strings, performed on a large Slovenian text that was converted to allophones. Next, we present a method we designed for selecting a compact and efficient set of Slovenian sentences from a large text corpus, so as to minimize the size of the final representative speech corpus. The selected sentences cover all of the desired most frequent Slovenian quadphones, triphones and, subsequently, diphones. We then describe the recording sessions and recording conditions, followed by the corpus annotation process. Finally, we describe the archive structure of the spoken corpus and report its content and size.
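The abstract does not spell out the selection method, so the following is only a minimal sketch of one common way to approach the same problem: count allophone n-grams over the text, keep the most frequent ones as targets, and then pick sentences with a greedy set-cover heuristic. The greedy heuristic is a stand-in, not necessarily the method used for AlpSynth, and the function names (`ngrams`, `frequent_units`, `greedy_select`), the `top_k` cutoff and the toy symbols are all hypothetical.

```python
from collections import Counter


def ngrams(units, n):
    """Return all contiguous n-unit strings in a list of allophone symbols."""
    return [tuple(units[i:i + n]) for i in range(len(units) - n + 1)]


def frequent_units(sentences, n, top_k):
    """Count n-grams over all sentences and keep the top_k most frequent.

    `sentences` is a list of allophone-symbol lists; `top_k` is an assumed
    cutoff, not a value taken from the paper.
    """
    counts = Counter()
    for units in sentences:
        counts.update(ngrams(units, n))
    return {gram for gram, _ in counts.most_common(top_k)}


def greedy_select(sentences, targets):
    """Greedy set cover: repeatedly take the sentence that covers the most
    still-uncovered target units. Returns (chosen indices, uncovered units)."""
    lengths = {len(t) for t in targets}
    remaining = set(targets)
    # Target units contained in each candidate sentence.
    coverage = {
        i: {g for n in lengths for g in ngrams(s, n)} & remaining
        for i, s in enumerate(sentences)
    }
    chosen = []
    while remaining and coverage:
        best = max(coverage, key=lambda i: len(coverage[i] & remaining))
        gain = coverage[best] & remaining
        if not gain:  # no remaining sentence adds coverage
            break
        chosen.append(best)
        remaining -= gain
        del coverage[best]
    return chosen, remaining


if __name__ == "__main__":
    # Toy allophone sequences (placeholder symbols, not real Slovenian data).
    corpus = [
        ["a", "b", "c", "d"],
        ["a", "b"],
        ["c", "d", "e"],
        ["b", "c"],
    ]
    diphones = frequent_units(corpus, n=2, top_k=10)
    chosen, uncovered = greedy_select(corpus, diphones)
    print("selected sentence indices:", chosen)
    print("uncovered units:", uncovered)
```

In a setup like this, targets of different lengths (for example quadphones and triphones) can be merged into one set before calling `greedy_select`; shorter units contained in the chosen sentences are then covered as a side effect.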