Abstract

Recovering discrete words from continuous speech is one of the first challenges facing language learners. Infants and adults can make use of the statistical structure of utterances to learn the forms of words from unsegmented input, suggesting that this ability may be useful for bootstrapping language-specific cues to segmentation. It is unknown, however, whether performance shown in small-scale laboratory demonstrations of “statistical learning” can scale up to allow learning of the lexicons of natural languages, which are orders of magnitude larger. Artificial language experiments with adults can be used to test whether the mechanisms of statistical learning are in principle scalable to larger lexicons. We report data from a large-scale learning experiment that demonstrates that adults can learn words from unsegmented input in much larger languages than previously documented and that they retain the words they learn for years. These results suggest that statistical word segmentation could be scalable to the challenges of lexical acquisition in natural language learning.

Highlights

  • Spoken speech is a continuous acoustic waveform without consistent breaks at the boundaries between words

  • A variety of computational systems are able to recover word boundaries with relative accuracy from an unsegmented corpus [3,4], and laboratory experiments show that, at least under certain conditions, human learners can do the same. These experimental demonstrations have used artificial languages with no prosody to show that both infants and adults are able to use the distribution of sound sequences to extract words from continuous speech [5,6]

  • The goal of the current study is to address this concern about the scalability of statistical learning

Summary

Introduction

Spoken speech is a continuous acoustic waveform without consistent breaks at the boundaries between words. A variety of computational systems are able to recover word boundaries with relative accuracy from an unsegmented corpus [3,4], and laboratory experiments show that, at least under certain conditions, human learners can do the same. These experimental demonstrations (often referred to as “statistical learning” experiments) have used artificial languages with no prosody to show that both infants and adults are able to use the distribution of sound sequences to extract words from continuous speech [5,6]. Infants in this type of experiment can even distinguish between strings that are matched for overall frequency but vary in their statistical coherence on measures like transitional probability (the probability of one syllable given the observation of another) [7].
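To make the notion of transitional probability concrete, the following is a minimal sketch in Python. It is an illustration only, not the procedure or the stimuli used in the study: the three-word toy lexicon, the word order, the function names, and the fixed threshold rule for placing boundaries are all assumptions introduced here. The forward transitional probability of a syllable pair is estimated as the count of the pair divided by the count of its first syllable.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate forward TP(b | a) = count(a followed by b) / count(a as first of a pair)."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # The final syllable never starts a pair, so it is excluded from the denominator.
    first_counts = Counter(syllables[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, tps, threshold):
    """Place a word boundary wherever the TP between adjacent syllables is below threshold.

    Thresholding is a simplifying assumption for this sketch, not a claim about
    how human learners or any published model locate boundaries.
    """
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical toy language: three trisyllabic words with no shared syllables.
lexicon = {"A": ["go", "la", "bu"], "B": ["pa", "do", "ti"], "C": ["bi", "da", "ku"]}
order = "ABCACBABCBACCAB"  # arbitrary word order, concatenated without pauses
stream = [syllable for word in order for syllable in lexicon[word]]

tps = transitional_probabilities(stream)
recovered = segment(stream, tps, threshold=0.9)
print(sorted(set(recovered)))
```

In this toy stream every within-word transition has probability 1.0, because each syllable belongs to exactly one word, while every transition across a word boundary has a lower probability, because several different words can follow. That contrast is the kind of distributional cue the experiments above rely on. A natural-language lexicon would not separate this cleanly, which is part of why scalability is an open question.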
