The surface F0 patterns of lexical tones contain variability from many different sources, which raises the question of how infants can learn tones from such highly variable input. A recent modeling study showed that the velocity profiles of F0 (i.e., first derivatives, or D1) enable naive learners to successfully categorize the four Mandarin tones despite cross-speaker and contextual variations [Gauthier et al., “Recognising tones by tracking moments—How infants may develop tonal categories from adult speech input,” in Proceedings of the ISCA Workshop on Plasticity in Speech Perception, edited by V. Dellwo (UCL Publications, London, UK, 2005), pp. 72–75]. The present study explores the robustness of D1 for categorizing Mandarin tones in sentences spoken under different focus conditions. As a contrastive communicative function, narrow focus within a multi-word utterance introduces extensive variability to F0, making tone learning a more challenging task. Using multi-speaker productions of utterances with systematically varied tones and focus [Xu, “Effects of tone and focus on the formation and alignment of F0 contours,” J. Phonet. 27, 55–105 (1999)], self-organizing neural networks were trained with syllable-sized D1 and F0 profiles as input, with no special treatment for different focus conditions. In the testing phase, novel tokens were presented for tonal categorization. The results revealed that D1 yielded excellent overall categorization, far superior to that obtained with F0. Detailed analyses showed that with D1, tone recognition performance dropped only in post-focus regions. These findings indicate that successful tone learning can be achieved with velocity profiles as input despite the variability introduced by focus.
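As a rough illustration of what a syllable-sized velocity profile is, the following minimal sketch computes D1 as the time derivative of a sampled F0 contour. It is not the procedure used in the study: the function name `velocity_profile`, the 10-point sampling, the semitone scale, and the contour values are all assumptions made for this example. The sketch also shows one plausible intuition, not a claim made above, for why D1 might be less sensitive to cross-speaker differences: a constant offset in pitch level leaves the derivative unchanged.

```python
import numpy as np

def velocity_profile(f0, duration_s):
    """Return D1 (first derivative of F0 over time) for one syllable.

    f0: equally spaced F0 samples (assumed here to be in semitones).
    duration_s: syllable duration in seconds.
    """
    f0 = np.asarray(f0, dtype=float)
    # Time stamps of the samples, spread evenly over the syllable
    t = np.linspace(0.0, duration_s, num=f0.size)
    # Finite-difference estimate of dF0/dt at every sample
    return np.gradient(f0, t)

# Hypothetical rising contour: 10 samples over a 200 ms syllable,
# produced at two different pitch levels (all values are made up).
t = np.linspace(0.0, 0.2, 10)
rise_low_speaker = 85.0 + 20.0 * t    # semitones
rise_high_speaker = 97.0 + 20.0 * t   # same shape, 12 semitones higher

d1_low = velocity_profile(rise_low_speaker, 0.2)
d1_high = velocity_profile(rise_high_speaker, 0.2)

# The raw F0 contours differ, but their velocity profiles coincide.
print(np.allclose(rise_low_speaker, rise_high_speaker))
print(np.allclose(d1_low, d1_high))
```

In a setup like the one described above, vectors such as `d1_low` would serve as the input to the learner in place of the raw F0 samples.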