A comparison of persistence assumptions for estimating teacher effects
The persistence of teacher effects is a crucial aspect of value-added models, especially when they inform high-stakes decisions. This study compared persistence assumptions (generalized, variable, complete, and zero) with respect to model fit and the consistency of estimates. Longitudinal data from assessments in mathematics, science, social sciences, and Turkish were analyzed. Results indicated that the model with the variable persistence assumption demonstrated superior fit in three of four subjects. The correlation between estimates from the generalized and variable persistence models was notably high; however, substantial differences were observed in the magnitude and rank order of the estimates. The implications of different persistence assumptions for teacher value-added score estimates are discussed.
- Research Article
14
- 10.1214/10-aoas405
- Jun 1, 2011
- The Annals of Applied Statistics
The increasing availability of longitudinal student achievement data has heightened interest among researchers, educators and policy makers in using these data to evaluate educational inputs, as well as for school and possibly teacher accountability. Researchers have developed elaborate "value-added models" of these longitudinal data to estimate the effects of educational inputs (e.g., teachers or schools) on student achievement while using prior achievement to adjust for nonrandom assignment of students to schools and classes. A challenge to such modeling efforts is the large number of students with incomplete records and the tendency for those students to be lower achieving. These conditions create the potential for results to be sensitive to violations of the assumption that data are missing at random, which is commonly used when estimating model parameters. The current study extends recent value-added modeling approaches for longitudinal student achievement data Lockwood et al. [J. Educ. Behav. Statist. 32 (2007) 125--150] to allow data to be missing not at random via random effects selection and pattern mixture models, and applies those methods to data from a large urban school district to estimate effects of elementary school mathematics teachers. We find that allowing the data to be missing not at random has little impact on estimated teacher effects. The robustness of estimated teacher effects to the missing data assumptions appears to result from both the relatively small impact of model specification on estimated student effects compared with the large variability in teacher effects and the downweighting of scores from students with incomplete data.
- Research Article
39
- 10.1177/016146811411600109
- Jan 1, 2014
- Teachers College Record: The Voice of Scholarship in Education
Background In the last decade, the effects of teachers on student performance (typically measured by state-wide standardized tests) have been re-examined using statistical models known as value-added models. These models aim to compute the unique contribution of teachers to student achievement gains from grade to grade, net of student background and prior ability. Value-added models are now widely used, and some states use them to rank teachers. The models measure teacher performance or effectiveness (via student achievement gains), with the ultimate objective of rewarding or penalizing teachers. Such practices have generated considerable controversy in the education community about the role of value-added models in making important decisions about teachers, such as salary increases, promotion, or termination of employment. Purpose The purpose of this paper is to review the effects teachers have on student achievement, with an emphasis on value-added models. The paper also discusses whether value-added models are appropriately used as a sole indicator in evaluating teachers’ performance and making critical decisions about teachers’ futures in the profession. Research Design This is a narrative review of the literature on teacher effects that includes evidence about the stability of teacher effects estimated using value-added models. Conclusions More comprehensive systems for teacher evaluation are needed. We need more research on value-added models and more work on evaluating them. The strengths and weaknesses of these models should be clearly described. We also need much more empirical evidence on the reliability and stability of value-added measures across different states. The findings thus far do not seem robust and conclusive enough to warrant decisions about raises, tenure, or termination of employment.
In other words, it is unclear that the value-added measures that inform the accountability system are adequate. It is not obvious that we are better equipped now to make such important decisions about teachers than we were 35 years ago. Good et al. have argued that we need well-thought-out and well-developed criteria that guide accountability decisions. Perhaps such criteria should be standardized across school districts and states. That would ensure that empirical evidence across different states is comparable and would help determine whether findings converge or diverge.
- Research Article
19
- 10.1007/s11092-019-09303-w
- Aug 1, 2019
- Educational Assessment, Evaluation and Accountability
Value-added (VA) modeling can be used to quantify teacher and school effectiveness by estimating the effect of pedagogical actions on students’ achievement. It is gaining increasing importance in educational evaluation, teacher accountability, and high-stakes decisions. We analyzed 370 empirical studies on VA modeling, focusing on modeling and methodological issues to identify key factors for improvement. The studies stemmed from 26 countries (68% from the USA). Most studies applied linear regression or multilevel models. Most studies (i.e., 85%) included prior achievement as a covariate, but only 2% included noncognitive predictors of achievement (e.g., personality or affective student variables). Fifty-five percent of the studies did not apply statistical adjustments (e.g., shrinkage) to increase precision in effectiveness estimates, and 88% included no model diagnostics. We conclude that research on VA modeling can be significantly enhanced regarding the inclusion of covariates, model adjustment and diagnostics, and the clarity and transparency of reporting.
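The shrinkage adjustment this review refers to can be illustrated with a minimal empirical-Bayes sketch: a teacher's raw mean residual is pulled toward the grand mean in proportion to its reliability, so noisy estimates from small classes are adjusted more. The variance components and student counts below are illustrative assumptions, not values from any of the reviewed studies.

```python
# Minimal empirical-Bayes shrinkage sketch (illustrative numbers only).
def shrink(raw_mean, n_students, var_teacher, var_student):
    """Shrink a teacher's raw mean residual toward 0 (the grand mean).

    reliability = true teacher variance / (true variance + sampling noise),
    so estimates based on few students are pulled toward 0 more strongly.
    """
    reliability = var_teacher / (var_teacher + var_student / n_students)
    return reliability * raw_mean

# Two teachers with the same raw mean: the one with fewer students
# receives the stronger shrinkage.
small = shrink(0.5, n_students=5, var_teacher=0.04, var_student=1.0)
large = shrink(0.5, n_students=50, var_teacher=0.04, var_student=1.0)
```

Under these assumed variances, the five-student estimate retains only about a sixth of its raw value, while the fifty-student estimate retains about two thirds, which is why unadjusted rankings tend to overrepresent small classes at both extremes.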
- Research Article
3
- 10.1515/2151-7509.1035
- Jan 25, 2012
- Statistics, Politics, and Policy
Both research and practice of value-added models (VAM) have been growing in recent years due to the widespread effort to quantify teacher effectiveness. The existing VAM literature has not yet tested the sensitivity of value-added estimates to the rules that define which students contribute to each teacher’s value-added estimate. Student-teacher linkages are often a complex network due to various transfers, students taking multiple courses in the same subject, and students receiving special education or other “pull-out” services. Complex linkages are often considered among the main threats to the validity of VAM. In this paper we conducted a case study to examine the sensitivity of VAM to alternative link definitions. We examined three popular VAM approaches and applied alternative rules for linking students to teachers with each method. We found no overall sensitivity of estimated teacher effects to the linking rules. Even though more teachers had value-added estimates under more inclusive rules, for teachers with estimates under all three rules, the correlation among pairs of estimates created using different linking rules was always above 0.95 and generally above 0.98 for each VAM approach. The value-added estimates of a small number of teachers were affected by highly different link definitions, and these tended to be teachers with small numbers of students. Imposing a minimum sample size for calculating individual teachers’ value-added largely reduced the sensitivity to link definitions.
- Book Chapter
1
- 10.1093/obo/9780199756810-0138
- Oct 29, 2013
Teacher evaluation has evolved over time from focusing on the moral values of a teacher in the early 1900s to standards-based evaluation models of today that seek to include measures of student academic progress. Often, teacher evaluation systems seek to serve two needs: accountability and improvement. Changes in teacher evaluation have been influenced by political winds as well as a desire to create systems that are fair and balanced. This article begins with an overview of the purposes of teacher evaluation. Next, often-cited international and US policy and research reports as well as foundational textbooks related to teacher effectiveness and teacher evaluation are highlighted. The article then provides an overview of early models of teacher evaluation focused on the roles and responsibilities of a teacher and the evolution to contemporary models of teacher evaluation with a focus on a standards-based and/or outcomes-based approach to evaluation. The next section highlights seminal works that emerged in measuring teacher effectiveness as well as value-added models to support an outcomes-based approach by including student academic progress as part of evaluation. Including student outcomes has been the topic of intense discussion as policymakers and researchers debate the validity of the use of student test scores in terms of value-added modeling and other growth models. Researchers do not agree on the stability of such models and whether they do differentiate between effective and less effective teachers. Research will continue to inform and enrich this debate and discussion. Teacher observation remains a critical part of the evaluation process and the article provides a historical overview of common practices and challenges of teacher observation. Finally, works that illuminate impacts of teacher evaluation are provided, including texts and reports related to teacher growth and development, teacher retention, and teacher compensation.
- Research Article
37
- 10.3102/1076998609346967
- Jun 1, 2010
- Journal of Educational and Behavioral Statistics
There is an increasing interest in using longitudinal measures of student achievement to estimate individual teacher effects. Current multivariate models assume each teacher has a single effect on student outcomes that persists undiminished to all future test administrations (complete persistence [CP]) or can diminish with time but remains perfectly correlated (variable persistence [VP]). However, when state assessments do not use a vertical scale, or the mix of topics present across a sequence of vertically aligned assessments changes as students advance in school, these assumptions of persistence may not be consistent with the achievement data. We develop the “generalized persistence” (GP) model, a Bayesian multivariate model for estimating teacher effects that accommodates longitudinal data that are not vertically scaled by allowing less than perfect correlation of a teacher’s effects across test administrations. We illustrate the model using mathematics assessment data.
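The GP model's key relaxation, a teacher's effects across test administrations that are correlated but not perfectly so, can be illustrated with a small simulation. The correlation value, effect standard deviations, and teacher count below are illustrative assumptions, not estimates from the article, and the full Bayesian model is of course far richer than this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each teacher has a vector of effects, one per test administration.
# Under CP/VP the vector is a scalar effect times fixed weights (perfect
# correlation); under GP the components are imperfectly correlated.
# rho and the standard deviations are illustrative assumptions.
rho = 0.7
sd = np.array([0.20, 0.15, 0.10])          # effect sd at grades g, g+1, g+2
corr = np.array([[1.0,     rho,  rho**2],
                 [rho,     1.0,  rho],
                 [rho**2,  rho,  1.0]])
cov = np.outer(sd, sd) * corr              # covariance of the effect vector

# Draw effect vectors for 200 hypothetical teachers.
effects = rng.multivariate_normal(np.zeros(3), cov, size=200)

# The empirical correlation across administrations sits near rho,
# strictly below the 1.0 that CP/VP would impose.
emp_corr = np.corrcoef(effects.T)
```

A fitted GP model would estimate this correlation structure from the data; when the estimated off-diagonal correlations approach 1, the model collapses back toward the VP specification.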
- Research Article
24
- 10.1080/0141192022000019071
- Dec 1, 2002
- British Educational Research Journal
The purpose of this article is to examine whether it is possible to combine two previously separate objectives of baseline assessment in mathematics: the use of baseline assessment for formative reasons and for value‐added functions. A review of research on early mathematics development helps to identify the importance of formative purposes of early assessment in mathematics. The development of policy on baseline assessment in mathematics is put within the broader debate about value‐added assessment. Findings of an empirical investigation into Cypriot pupils' skills and knowledge in mathematics upon entry to primary school and at the end of year 2 are presented. Significant differences among the skills and knowledge of pupils entering primary school were identified. Cluster analysis revealed five relatively homogeneous groups of pupils entering primary school according to their different knowledge and skills in mathematics. The predictive validity of baseline assessment for pupils' attainment at the end of year 2 was satisfactory. Pupil background factors were significantly related to pupils' attainment on the baseline assessment and to their attainment at the end of year 2. However, the baseline score was the most important factor in relation to pupils' progress. Pupils with special needs (either for further support or for extended activities) made less progress than pupils who were typical for their age. Differences between schools' final results were reduced substantially when account was taken of their pupil intakes, but significant differences between schools remained. It is argued that the research findings reveal the importance of developing a model of baseline assessment in mathematics attempting to achieve two purposes: identifying what pupils entering the primary school know and what they do not know in order to trigger differentiated intervention; and establishing a basis for measuring future progress in mathematics through a value‐added analysis. 
Implications for using such a model to raise achievement in mathematics are discussed.
- Research Article
7
- 10.1007/s11092-022-09386-y
- May 23, 2022
- Educational Assessment, Evaluation and Accountability
There is no final consensus regarding which covariates should be used (in addition to prior achievement) when estimating value-added (VA) scores to evaluate a school’s effectiveness. Therefore, we examined the sensitivity of evaluations of schools’ effectiveness in math and language achievement to covariate selection in the applied VA model. Four covariate sets were systematically combined, including prior achievement from the same or different domain, sociodemographic and sociocultural background characteristics, and domain-specific achievement motivation. School VA scores were estimated using longitudinal data from the Luxembourg School Monitoring Programme with some 3600 students attending 153 primary schools in Grades 1 and 3. VA scores varied considerably, despite high correlations between VA scores based on the different sets of covariates (.66 < r < 1.00). The explained variance and consistency of school VA scores substantially improved when including prior math and prior language achievement in VA models for math, and prior language achievement with sociodemographic and sociocultural background characteristics in VA models for language. These findings suggest that prior achievement in the same subject, the most commonly used covariate to date, may be insufficient to control for between-school differences in student intake when estimating school VA scores. We thus recommend using VA models with caution and applying VA scores for informative purposes rather than as a basis for accountability decisions.
- Book Chapter
- 10.4324/9781138609877-ree22-1
- May 30, 2022
Over the past 50 years numerous empirical studies have examined the effects that teachers have on student performance. The findings about teacher effects on student performance have been mixed and inconclusive. Although there is systematic evidence that teacher effectiveness varies considerably, teacher characteristics such as education, experience, and salary are not always associated with student performance. In addition, teacher characteristics such as experience and education explain only a small part of the variability in teacher effectiveness. Over the past 15 years value-added models have been used to estimate teacher effects. This entry reviews the literature on teacher effects and focuses on the magnitude, stability, and estimation of teacher effects over time. Suggestions about future research on teacher effects are offered.
- Research Article
- 10.63544/ijss.v4i3.143
- Jul 17, 2025
- Inverge Journal of Social Sciences
This study investigates the psychological impact of corrective feedback on English as a Second Language (ESL) students' language anxiety using a quantitative research approach. Conducted among 80 intermediate-level ESL learners in Lahore and Karachi, Pakistan, the research examines how different types and frequencies of corrective feedback (explicit correction, metalinguistic feedback, recasts, clarification requests, and elicitation) affect learners’ emotional responses. Data were collected through a structured questionnaire incorporating items from the Foreign Language Classroom Anxiety Scale (FLCAS) and were analysed using descriptive statistics, independent samples t-tests, and Pearson correlation coefficients. The findings reveal that explicit correction and metalinguistic feedback are most strongly associated with elevated levels of language anxiety, while recasts result in significantly lower anxiety responses. A moderate positive correlation was also found between feedback frequency and anxiety levels, indicating that more frequent corrective input can exacerbate learners' emotional discomfort. These results highlight the need for pedagogical practices that balance effective error correction with emotional sensitivity. The study underscores the importance of using indirect feedback strategies and fostering a psychologically supportive classroom environment to enhance ESL learners' confidence and communicative engagement.
- Book Chapter
- 10.1007/978-3-319-19977-1_11
- Jan 1, 2015
The application of value-added modeling (VAM) in educational settings has gained momentum in the past decade or so due to the interest in using test scores to evaluate teachers or schools, and a myriad of VAM models are now available to researchers and practitioners. Despite this large number, McCaffrey et al. (2004) summarized the relations among the models and concluded that many can be viewed as special cases of persistence models. In persistence models, student scores are modeled as the sum of teacher effects accumulated across years. Since students may change teachers every year and thus hold membership in multiple group units, such models are also referred to as “multiple membership” models (Browne et al. 2001; Rasbash and Browne 2001). Persistence models differ from each other in the value of the persistence parameter, which ranges from 0 to 1 and denotes the extent to which teacher effects in the current year persist into subsequent years, whether vanished, undiminished, or diminished. The Variable Persistence (VP) model (Lockwood et al. 2007; McCaffrey et al. 2004) is considered more flexible because it estimates the persistence parameter freely, while other persistence models constrain its value to be either 0 or 1.
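The score decomposition described here, each year's score as a persistence-weighted sum of current and past teacher effects, can be sketched in a small simulation. All quantities below (counts of students, teachers, and grades, the effect and noise scales, and the 0.4 decay weight) are illustrative assumptions; in the VP model the persistence weights are estimated rather than fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: 3 grades, each student has one teacher per grade.
# theta[g, j] is the effect of teacher j in grade g; alpha[t][g] is the
# persistence of a grade-g teacher's effect on the grade-t score.
n_students, n_teachers, n_grades = 1000, 10, 3
theta = rng.normal(0.0, 0.2, size=(n_grades, n_teachers))          # teacher effects
assign = rng.integers(0, n_teachers, size=(n_grades, n_students))  # student-teacher links

def scores(alpha):
    """Simulate grade-t scores as the persistence-weighted sum of
    current and past teacher effects plus student-level noise."""
    y = np.zeros((n_grades, n_students))
    for t in range(n_grades):
        for g in range(t + 1):
            y[t] += alpha[t][g] * theta[g, assign[g]]
        y[t] += rng.normal(0.0, 1.0, n_students)
    return y

# Complete persistence: past effects persist undiminished (all weights 1).
alpha_cp = [[1.0] * (t + 1) for t in range(n_grades)]
# Zero persistence: only the current teacher's effect enters the score.
alpha_zp = [[1.0 if g == t else 0.0 for g in range(t + 1)] for t in range(n_grades)]
# Variable persistence: past effects are down-weighted; 0.4 is illustrative.
alpha_vp = [[1.0 if g == t else 0.4 for g in range(t + 1)] for t in range(n_grades)]

y_cp, y_zp, y_vp = scores(alpha_cp), scores(alpha_zp), scores(alpha_vp)
```

Fitting a persistence model reverses this simulation: given the scores and the student-teacher links, it recovers the teacher effects, and under the VP specification also the decay weights.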
- Research Article
453
- 10.3102/10769986029001067
- Mar 1, 2004
- Journal of Educational and Behavioral Statistics
The use of complex value-added models that attempt to isolate the contributions of teachers or schools to student development is increasing. Several variations on these models are being applied in the research literature, and policy makers have expressed interest in using these models for evaluating teachers and schools. In this article, we present a general multivariate, longitudinal mixed-model that incorporates the complex grouping structures inherent to longitudinal student data linked to teachers. We summarize the principal existing modeling approaches, show how these approaches are special cases of the proposed model, and discuss possible extensions to model more complex data structures. We present simulation and analytical results that clarify the interplay between estimated teacher effects and repeated outcomes on students over time. We also explore the potential impact of model misspecifications, including missing student covariates and assumptions about the accumulation of teacher effects over time, on key inferences made from the models. We conclude that mixed models that account for student correlation over time are reasonably robust to such misspecifications when all the schools in the sample serve similar student populations. However, student characteristics are likely to confound estimated teacher effects when schools serve distinctly different populations.
- Book Chapter
- 10.1016/b978-0-08-044894-7.01374-9
- Jan 1, 2010
- International Encyclopedia of Education
Value-Added Models
- Research Article
164
- 10.1162/edfp_a_00027
- Jan 1, 2011
- Education Finance and Policy
Value-added modeling continues to gain traction as a tool for measuring teacher performance. However, recent research questions the validity of the value-added approach by showing that it does not mitigate student-teacher sorting bias (its presumed primary benefit). Our study explores this critique in more detail. Although we find that estimated teacher effects from some value-added models are severely biased, we also show that a sufficiently complex value-added model that evaluates teachers over multiple years reduces the sorting bias problem to statistical insignificance. One implication of our findings is that data from the first year or two of classroom teaching for novice teachers may be insufficient to make reliable judgments about quality. Overall, our results suggest that in some cases value-added modeling will continue to provide useful information about the effectiveness of educational inputs.
- Discussion
2
- 10.1080/2330443x.2014.955228
- Nov 7, 2014
- Statistics and Public Policy
The position statement on value-added models published by the American Statistical Association (ASA) suggested guidelines and issues to consider when using value-added models as a component of a teacher evaluation system. One suggestion offered is that value-added results should be accompanied by measures of precision. It is important, however, to go beyond simply reporting measures of precision alongside point estimates and instead formally include them in the value-added models and teacher classification systems. This practice will lead to improved inferences and reduced misclassification rates. Therefore, the aim of this article is to offer two approaches for incorporating measures of precision into the value-added modeling process. The first is to account for measurement error in student test scores and is motivated by the claim that measurement error is of little concern when the model conditions on at least three test scores. The second is to directly incorporate the standard errors of the point estimates when forming overall classifications of teacher effects. This is intended to demonstrate ways in which teacher misclassification rates can be minimized.
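The second suggestion, folding standard errors into teacher classifications, can be sketched as an interval-based rule: a teacher is flagged as above or below average only when the interval around the point estimate excludes zero. The 95% criterion and the zero cutoff below are illustrative assumptions, not the article's exact procedure.

```python
# Illustrative interval-based classification rule (not the article's
# exact procedure): flag a teacher only when the confidence interval
# around the value-added point estimate excludes zero.
def classify(estimate, se, z=1.96):
    """Classify a teacher effect using its estimate and standard error."""
    lower, upper = estimate - z * se, estimate + z * se
    if upper < 0:
        return "below average"
    if lower > 0:
        return "above average"
    return "not distinguishable from average"

precise = classify(-0.30, 0.10)  # precise negative estimate
noisy = classify(-0.30, 0.25)    # same estimate, large uncertainty
```

The two calls share the same point estimate, but only the precisely estimated one is flagged; a rule based on the point estimate alone would classify both teachers identically, which is exactly the misclassification risk the article targets.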