Chunking or Not Chunking? How Do We Find Words in Artificial Language Learning?

Ana Franco (afranco@ulb.ac.be)
Arnaud Destrebecqz (adestre@ulb.ac.be)
Cognition, Consciousness and Computation Group, Université libre de Bruxelles, 50 ave. F.-D. Roosevelt, B1050 Belgium

Abstract

What is the nature of the representations acquired in implicit statistical learning? Recent results in the field of language learning have shown that adults and infants are able to find the words of an artificial language when exposed to a continuous auditory sequence consisting of a random ordering of these words. Such performance can only be based on processing the transitional probabilities between sequence elements. Two different kinds of mechanisms may account for these data: participants either parse the sequence into smaller chunks corresponding to the words of the artificial language, or they become progressively sensitive to the actual values of the transitional probabilities. The two accounts are difficult to differentiate because they tend to make similar predictions in similar experimental settings. In this study, we present two experiments aimed at disentangling these two theories. In these experiments, participants had to learn two sets of pseudo-linguistic regularities (L1 and L2) presented in the context of a Serial Reaction Time (SRT) task. L1 and L2 were either unrelated, or the intra-word transitions of L1 became the inter-word transitions of L2. The two models make opposite predictions in these two situations. Our results indicate that the nature of the representations depends on the learning conditions. When cues were presented to facilitate parsing of the sequence, participants learned the words of the artificial language. However, when no cues were provided, their performance was strongly influenced by the actual values of the transitional probabilities.

Keywords: implicit statistical learning; SRN; chunking; serial reaction time task

Introduction

A central issue in implicit learning research concerns the nature of the acquired knowledge. Does it reflect the abstract rules on which the training material is based, or the surface features of the material, such as the frequencies of individual elements or chunks? According to some theorists, cognition can be viewed as rule-based symbol manipulation (Pinker & Prince, 1988). From this perspective, learning would consist in the formation of new abstract, algebra-like rules. According to another theoretical position, information processing is essentially based on associative processes. In this view, learning would not depend on rule acquisition but on mechanisms capable of extracting the statistical regularities present in the environment (e.g., Elman, 1990).

Over the last few years, a series of experimental results has provided new insights into the nature of the representations involved in implicit learning. Research on language acquisition has shown that 8-month-old infants are sensitive to statistical information (Jusczyk et al., 1994; Saffran, Aslin, & Newport, 1996; Saffran, Johnson, Aslin, & Newport, 1999) and capable of learning distributional relationships between linguistic units (Gomez & Gerken, 1999; Jusczyk, Houston, & Newsome, 1999; Saffran, Aslin, & Newport, 1996; Perruchet & Desaulty, 2008) presented in the continuous speech stream formed by an artificial language.
Other studies have indicated that adults are also capable of extracting statistical regularities, and that these mechanisms are not restricted to linguistic material but also apply to auditory non-linguistic stimuli (Saffran, Johnson, Aslin, & Newport, 1999) or to visual stimuli (Fiser & Aslin, 2002). In the same way, implicit sequence learning studies have indicated that human learners are good at detecting the statistical regularities present in a serial reaction time (SRT) task. Altogether, these data suggest that statistical learning depends on associative learning mechanisms rather than on the existence of a “rule abstractor device” (Perruchet, Tyler, Galland, & Peereman, 2004).

However, different models have been proposed to account for the data. According to the Simple Recurrent Network (SRN) model (Elman, 1990; Cleeremans & McClelland, 1991; Cleeremans, 1993), learning is based on the development of associations between the temporal context in which the successive elements occur and their possible successors. Over training, the network learns to provide the best prediction of the next target in a given context, based on the transitional probabilities between the different sequence elements. Chunking models such as PARSER, on the other hand, view learning as an attention-based parsing process that results in the formation of distinctive, unitary, rigid representations, or chunks (Perruchet & Vinter, 1998). Thus, both models are based on processing statistical regularities, but only PARSER leads to the formation of “word-like” units.

Although the representations assumed by these two classes of models are quite different, contrasting their assumptions is made difficult by the fact that they tend to make similar experimental predictions. For instance, in a typical artificial language learning experiment, participants are exposed to a continuous stream of plurisyllabic non-words (e.g., BATUBI, DUTABA…) presented in a random order, such that transitional probabilities between syllables are stronger intra-word than inter-word.
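To make this statistical structure concrete, the following minimal sketch (not taken from the paper; the four-word lexicon with non-overlapping syllables is a hypothetical assumption chosen for clarity) estimates transitional probabilities from a continuous stream obtained by randomly concatenating artificial words:

```python
# Illustrative sketch: estimating transitional probabilities between syllables
# from a continuous stream built out of artificial words. The lexicon below is
# hypothetical; its syllables do not overlap across words, so intra-word and
# inter-word transitions are easy to contrast.
import random
from collections import Counter

words = ["bidaku", "padoti", "golabu", "tupiro"]      # hypothetical lexicon
syllabify = lambda w: [w[i:i + 2] for i in range(0, len(w), 2)]

stream = []                                           # continuous stream,
for _ in range(500):                                  # no pauses between words
    stream.extend(syllabify(random.choice(words)))

pair_counts = Counter(zip(stream, stream[1:]))        # syllable bigram counts
first_counts = Counter(stream[:-1])                   # counts of the first element
tp = {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}

# Intra-word transitions (e.g., "bi" -> "da") approach 1.0, whereas inter-word
# transitions (e.g., "ku" -> "pa") hover around 1 / number_of_words, because
# word order in the stream is random.
print(tp[("bi", "da")], tp.get(("ku", "pa"), 0.0))
```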
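For completeness, here is a minimal sketch of what an Elman-style SRN of the kind described above could look like. The one-hot syllable coding, layer sizes, and learning rate are illustrative assumptions, not the authors' implementation; as in Elman (1990), gradients are truncated at the copied context layer.

```python
# Minimal Elman-style SRN sketch (illustrative assumptions: one-hot syllable
# coding, layer sizes, learning rate; not the authors' implementation).
import random
import numpy as np

rng = np.random.default_rng(0)

class SRN:
    def __init__(self, n_items, n_hidden=20, lr=0.1):
        self.W_in = rng.normal(0.0, 0.1, (n_hidden, n_items))
        self.W_ctx = rng.normal(0.0, 0.1, (n_hidden, n_hidden))
        self.W_out = rng.normal(0.0, 0.1, (n_items, n_hidden))
        self.context = np.zeros(n_hidden)   # copy of the previous hidden state
        self.lr = lr

    def step(self, x, target):
        # Forward pass: the hidden state combines the current input with the
        # copied context; the output is a prediction of the next element.
        h = np.tanh(self.W_in @ x + self.W_ctx @ self.context)
        logits = self.W_out @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # Backward pass, truncated: no gradient flows into the old context.
        d_logits = p - target                          # softmax + cross-entropy
        d_h = (self.W_out.T @ d_logits) * (1.0 - h ** 2)
        self.W_out -= self.lr * np.outer(d_logits, h)
        self.W_in -= self.lr * np.outer(d_h, x)
        self.W_ctx -= self.lr * np.outer(d_h, self.context)
        self.context = h                               # becomes the next context
        return p

# Rebuild a stream as in the previous sketch and train on it, one element at a
# time, always predicting the next syllable.
words = ["bidaku", "padoti", "golabu", "tupiro"]       # same hypothetical lexicon
syllabify = lambda w: [w[i:i + 2] for i in range(0, len(w), 2)]
stream = [s for _ in range(500) for s in syllabify(random.choice(words))]

syllables = sorted(set(stream))
onehot = np.eye(len(syllables))
net = SRN(len(syllables))
for prev, nxt in zip(stream, stream[1:]):
    net.step(onehot[syllables.index(prev)], onehot[syllables.index(nxt)])
```

On this reading, the network's output activations come to approximate the transitional probabilities between syllables, so it predicts the stream well without ever storing word units, whereas a chunking model such as PARSER would instead represent the words themselves as units.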