Abstract
This article introduces a wide range of approaches to using large bodies of data for linguistic research. Corpus analysis for phonological research involves the investigation of the phonetic, phonological, and lexical properties of speech for the purpose of understanding the patterns of variation in the phonetic expression of words, and the distributional patterns of sound elements in relation to the linguistic context. A speech corpus provides a basis for investigating variability in phonetic form and also provides a rich resource for studying the relationship between phonological form and other levels of linguistic structure. Linguistic metadata provides information about the speakers, such as sex, age, ethnicity, and region of residence. Metadata may also provide information about speaker recruitment and recording procedures. Forced alignment is done using algorithms from automatic speech recognition (ASR), and is most successful when each phone associated with the word in its dictionary form is actually fully pronounced. One of the easiest methods of manipulating natural speech is the splicing technique, where parts of a speech signal are cut out, repeated, or cross-spliced with another piece of the signal. The gating technique is another form of natural speech signal manipulation often applied in psycholinguistic experiments, where parts of a speech signal are cut off, and incrementally more of the signal is presented to a listener. Another speech signal manipulation is the mixing of two signals.
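The three signal manipulations named above (splicing, gating, and mixing) can be illustrated with a minimal sketch. It assumes a speech signal is represented as a plain list of sample values; the function names `splice`, `gates`, and `mix` and the `gain_b` parameter are hypothetical, chosen only for illustration, and are not taken from the article.

```python
def splice(signal, start, end, donor=None):
    """Cut out samples [start, end); if donor is given, insert it in their place.

    With donor=None this is a plain cut; with a donor taken from another
    signal it is a cross-splice; with donor=signal[start:end] * 2 the
    segment is repeated.
    """
    middle = list(donor) if donor is not None else []
    return list(signal[:start]) + middle + list(signal[end:])


def gates(signal, step):
    """Return onset-aligned fragments of increasing length (gating).

    Each gate contains `step` more samples than the previous one;
    the final gate is the complete signal.
    """
    return [list(signal[:n]) for n in range(step, len(signal) + step, step)]


def mix(a, b, gain_b=1.0):
    """Add two signals sample by sample, zero-padding the shorter one.

    gain_b scales the second signal, e.g. to set a noise level.
    """
    n = max(len(a), len(b))
    padded_a = list(a) + [0.0] * (n - len(a))
    padded_b = list(b) + [0.0] * (n - len(b))
    return [x + gain_b * y for x, y in zip(padded_a, padded_b)]
```

In practice the samples would come from an audio file, and cut points would normally be placed at zero crossings or smoothed with short fades to avoid audible clicks; the sketch leaves those steps out.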