Prediction from early development to later achievement has the potential to improve clinical and educational service delivery as well as to inform developmental theory. In this longitudinal study, we asked how well educational achievement measured in the final year (Grade 9, age 15) of compulsory education, both overall and for outcomes in the lowest 20%, can be predicted from information available in the first 3 years of life, particularly early expressive vocabulary. Measures for 2,767 children (1,345 males, 1,422 females) aged 16 to 30 months on early expressive vocabulary, along with family socioeconomic status (parental education, occupation, and household income), other demographic information (gender, birth order, parental age, social benefits, etc.), timing and nature of early child care, and early home literacy experience, were used to predict performance on the Danish Upper Secondary School Leaving Exam (USSLE) in Danish, English, Math, and Science. The primary analysis was a cross-validated combination of Lasso (least absolute shrinkage and selection operator) and ordinary least squares regression for continuous outcomes, and a cross-validated combination of Lasso and logistic regression for categorical outcomes. For continuous outcome measures, the patterns of prediction varied by domain; R² ranged from 9.4% to 21.4%. For low USSLE performance, area under the curve statistics ranged from 64.1% to 72.2%. In all domains, early childhood expressive vocabulary made a significant unique contribution to the outcome when measured over the full range. Early vocabulary also significantly predicted low scores in Danish and English, although not in Math and Science. Although the predictions were not strong enough for clinical diagnosis on their own, they demonstrate that low early vocabulary is an important and measurable risk condition that can direct early intervention and thereby contribute to later educational attainment.