Abstract

Physics education researchers (PER) often analyze student data with single-level regression models (e.g., linear and logistic regression). However, education datasets can have hierarchical structures, such as students nested within courses, that single-level models fail to account for. The improper use of single-level models to analyze hierarchical datasets can lead to biased findings. Hierarchical models (also known as multi-level models) account for this nested structure in the data. In this publication, we outline the theoretical differences between how single-level and multi-level models handle hierarchical datasets. We then present an analysis of a dataset from 112 introductory physics courses using both multiple linear regression and hierarchical linear modeling to illustrate the potential impact of using an inappropriate analytical method on PER findings and implications. Research can leverage multi-institutional datasets to improve the field's understanding of how to support student success in physics. There is no post hoc fix, however, if researchers use inappropriate single-level models to analyze multi-level datasets. To continue developing reliable and generalizable knowledge, PER should use hierarchical models when analyzing hierarchical datasets. The supplemental materials include a sample dataset, R code for the model building and analysis presented in the paper, and an HTML output from the R code.

Highlights

  • Work in physics education research (PER) has focused on student conceptual change, which remains a popular area of study [1]

  • The historical failure to account for the hierarchical structure of the data in many PER studies calls into question the validity and reliability of their claims

  • Hierarchical linear modeling (HLM) has been in use since the mid-1980s [58], and commercial software designed to perform HLM has been available since the 1990s [59]


Summary

INTRODUCTION

Work in physics education research (PER) has focused on student conceptual change, which remains a popular area of study [1]. Data from multiple contexts introduce a hierarchical structure in which student data (level 1) nest within course data (level 2). This nesting can include additional levels, such as departments (level 3) and institutions (level 4). Connections between data points within a hierarchical dataset violate the assumption of independence, which is central to many statistical analyses [e.g., multiple linear regression (MLR) and analysis of variance (ANOVA)]. The purpose of this article is to assist researchers in identifying and applying the regression analysis techniques best suited to their data and research questions. We will accomplish this purpose in three sections: (i) motivation, in which we review PER's historical use of regression analyses; (ii) theory, in which we discuss the theoretical advantages and disadvantages of three common techniques for dealing with hierarchical data; and (iii) application, in which we examine the practical implications of using MLR vs hierarchical linear modeling (HLM).
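To make the independence problem concrete, the following is a minimal sketch in Python using simulated data, not the paper's dataset or its R code. It assumes a balanced two-level design (students within courses) and estimates the intraclass correlation, a common summary of how much score variance sits at the course level, using the one-way ANOVA estimator. The function names, the score scale, and the variance values are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean


def simulate_scores(n_courses, n_students, course_sd, student_sd, seed=0):
    """Simulate a balanced design: a list of courses, each a list of student scores.

    Every student in a course shares that course's random effect, which is
    what makes observations within a course non-independent.
    """
    rng = random.Random(seed)
    courses = []
    for _ in range(n_courses):
        course_effect = rng.gauss(0.0, course_sd)  # level-2 (course) deviation
        courses.append(
            [50.0 + course_effect + rng.gauss(0.0, student_sd)  # level-1 noise
             for _ in range(n_students)]
        )
    return courses


def intraclass_correlation(courses):
    """One-way ANOVA estimate of the intraclass correlation for equal group sizes."""
    k = len(courses)      # number of courses
    n = len(courses[0])   # students per course (assumed equal across courses)
    grand_mean = mean(x for c in courses for x in c)
    course_means = [mean(c) for c in courses]
    ms_between = n * sum((m - grand_mean) ** 2 for m in course_means) / (k - 1)
    ms_within = sum(
        (x - m) ** 2 for c, m in zip(courses, course_means) for x in c
    ) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)


# Hypothetical settings: 112 courses of 30 students each.
# With course_sd=5 and student_sd=10 the true value is 25 / (25 + 100) = 0.2.
icc_nested = intraclass_correlation(
    simulate_scores(112, 30, course_sd=5.0, student_sd=10.0)
)
# With no course-level variance, the estimate should sit near zero.
icc_independent = intraclass_correlation(
    simulate_scores(112, 30, course_sd=0.0, student_sd=10.0)
)
print(f"nested: {icc_nested:.3f}, independent: {icc_independent:.3f}")
```

An estimate clearly above zero indicates that students in the same course resemble one another more than students in different courses, which is the situation in which a single-level model's independence assumption does not hold.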

MOTIVATION
Sampling
Modeling
Disaggregation
Aggregation
Hierarchical linear modeling
HLM assumptions
APPLICATION
Data collection and preparation
Centering
Assumption checking
Descriptive statistics
Findings
Discussion
CONCLUSION