Abstract

As part of the Polyphone project, Texas Instruments is collecting and developing a corpus of telephone speech in American Spanish. The corpus, called Voice Across Hispanic America (VAHA), is intended to provide balanced phonetic coverage of the language and to contain widely used vocabulary items such as digits, letter strings, yes/no responses, proper names, and selected command words and phrases used in automated telephone service applications. The speakers are native speakers of Spanish living in the United States. Collection and development of the corpus are expected to be completed by June 1995. So far, the authors have collected speech from about 500 speakers in various parts of the U.S. They describe the design issues in several aspects of the project: subject recruitment, corpus and prompt sheet design, the data acquisition system, and validation and transcription. They conclude with a brief statistical profile of the data collected.


