Learning to speak requires coordination between auditory and motor systems. Motor programs need to be fine-tuned based on goals and feedback. The motor-tuning process includes establishing the sensory-to-motor transformation (generating motor codes from articulation) and the motor-to-sensory transformation (generating predictions for speech control). In this study, we investigated the relations among three factors (feedback, goals, and predictions) in speech production in the context of second language learning. Adult Mandarin speakers were asked to learn two non-native vowels, /o/ and /ɵ/, the former being less similar than the latter to Mandarin vowels, under feedback-available or feedback-masked conditions. We found no improvement in learning when feedback was masked, suggesting that motor-based prediction could not be directly compared with goals in adult second language acquisition. Furthermore, feedback helped learning only when the target sounds were distinct from existing sounds in one’s native speech (that is, when competition between prediction and goals is minimal). The results suggest that prediction and goals may share a similar representational format, which could yield a competing relation in speech learning. Feedback can conditionally overcome such interference between prediction and goals. The study further probed the functional relations among the key components (prediction, goals, and feedback) of sensorimotor integration in speech production and learning.