Both language and music consist of discrete elements organized in embedded hierarchical structures. In this review, Schon and Francois show convincingly that musical expertise facilitates the learning of both linguistic and musical structures. At the behavioral level, the musicians did not outperform the non-musicians. However, ERP analyses showed that the acquisition of boundary perception (segmentation) between units improved with musical training. The experimental strategy typically used to investigate segmentation relies, in a learning phase, on passive exposure to artificially constructed linguistic and musical material (cf. Figure 1). The authors plausibly argue, supported by a solid literature in this field, that such perceptual learning partially relies on statistics. The probability that a certain element is followed by another differs within and between units (words or tone sequences): it is typically higher within a unit than across a unit boundary. In the test phase, participants are asked to discriminate units from non-units. Statistical learning is by no means restricted to the auditory domain.
Why would musical expertise facilitate such learning in language? Musical and linguistic syntactic capacities appear to be correlated (Jentschke et al., 2008; Jentschke and Koelsch, 2009; see also the sources cited by Schon and Francois, p. 5). Moreover, the brain substrates for the production and perception of language and music partially neighbor or overlap one another, although distinct hemispheric dominances for music and language are evident (Zatorre, 2001; Koelsch et al., 2002; Brown et al., 2006). Schon and Francois observed similar ERP responses to linguistic and musical test items (cf. Figure 4). Shared cerebral networks and behavioral features involved in the processing of complex sound suggest common roots.
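The statistical cue described above can be made concrete with a small sketch. The code below is an illustration only, not the material or analysis used by Schon and Francois: the three trisyllabic "words", the word order, and the boundary threshold of 0.9 are all hypothetical choices made for this example. It estimates the probability that one syllable follows another in a continuous stream and places a boundary wherever that probability drops.

```python
from collections import Counter

def transitional_probabilities(stream):
    """Estimate P(next element | current element) for each adjacent pair."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])  # the last element has no successor
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(stream, tp, threshold):
    """Insert a boundary wherever the transitional probability is below threshold."""
    units, current = [], [stream[0]]
    for a, b in zip(stream, stream[1:]):
        if tp[(a, b)] < threshold:
            units.append("".join(current))
            current = []
        current.append(b)
    units.append("".join(current))
    return units

# Hypothetical artificial "language": three words of three syllables each.
WORDS = [("ba", "du", "ki"), ("to", "ne", "su"), ("mi", "ga", "lo")]

# Hypothetical fixed word order in which every word is followed by every word,
# so that no transition across a word boundary is fully predictable.
ORDER = [0, 1, 2, 0, 2, 1, 0, 0, 1, 1, 2, 2] * 10

# Continuous stream without pauses, as in a passive-exposure learning phase.
stream = [syllable for index in ORDER for syllable in WORDS[index]]

tp = transitional_probabilities(stream)
units = segment(stream, tp, threshold=0.9)  # threshold is an assumption
```

In this toy stream every within-word transition is fully predictable, whereas transitions across word boundaries are not, so the dips in transitional probability alone are enough to recover the three words. Real learners and real material are noisier, and the sketch makes no claim about how the brain implements such a computation.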
As Darwin already suggested in his book “The Descent of Man and Selection in Relation to Sex,” a precursor or “proto language” may have preceded the emergence of separate musical and linguistic human capacities, which would explain the observed commonalities in brain and behavior. Vocal learning capacities possibly contributed to the “survival of the fittest.” We share similar vocal learning capacities with other higher-order vertebrates (birds, whales, etc.), as shown in recent comparative research (Huron, 2001; Hauser and McDermott, 2003). More precisely, it is not vocal discrimination as such but the learning of vocal discrimination that seems innate. Learning, in turn, is synonymous with plasticity. We can become experts in very different domains, and behavior and brain adapt accordingly, at both the functional and the structural level (Maguire et al., 2000; Pascual-Leone, 2001; Brecht and Schmitz, 2008; James et al., 2008; Oechslin et al., 2009; Schlaug et al., 2009). In this context it is not surprising, as the authors also state, that experts in the musical domain show increased learning capacities for segmentation in both music and language. I would argue that trained musicians segment more efficiently not because their statistical learning as such is better, but because their discrimination and memory of complex sound are better, which in turn allows improved statistical learning compared with non-musicians, even in a non-musical domain such as language. In conclusion, the joint examination of music and language constitutes a powerful means to gain further insight into the processing of highly structured complex sounds in language and music, and into their shared behavioral and cerebral features. Schon and Francois provide us with compelling examples of such research.