Abstract

[This paper is part of the Focused Collection on Quantitative Methods in PER: A Critical Examination.] How data are collected and how they are analyzed is typically described in the literature, but how the data are encoded is often not described in detail. In this paper, we discuss how data typically gathered in PER are encoded and how the choice of encoding plays a role in data analysis. We describe the kinds of data that are found when using short answer, multiple choice, Likert-scale, ranking task, and free response questions in terms of nominal, ordinal, interval, and ratio data. We discuss the mathematical operations that are available for each kind of data and how this affects ways that similarity and difference between student responses can be determined, a topic we discuss in terms of measures of distances and correlation. Finally, we use several papers from the literature to discuss ways in which data have been encoded and analyzed, with examples of normalized gain, factor analysis, model analysis, cluster analysis, and the investigation of epistemological agreement. We highlight both strengths and weaknesses of the data encoding approaches used in these studies. Our goal is not a comprehensive review, but one that is illustrative and can help researchers understand their own and each other’s work more deeply.

Received 31 August 2018

DOI: https://doi.org/10.1103/PhysRevPhysEducRes.15.020103

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article’s title, journal citation, and DOI.

Physics Subject Headings (PhySH)
Research Areas: Research methodology
Physics Education Research

Highlights

  • Many kinds of research in physics education research (PER) involve connecting a research question to observations of student behavior or knowledge in such a way that the information contained in the observations can be used to answer the research question

  • Tests like the Force and Motion Conceptual Evaluation (FMCE) and Force Concept Inventory (FCI) have been designed as tools with which certain patterns of student behavior or knowledge can be measured in some populations

  • We begin by describing the properties of different forms of data and how these relate to the encoding of data. We describe how these properties correspond to the information being represented, and how these properties affect the sorts of mathematical operations that are permissible on the data


Summary

INTRODUCTION

Many kinds of research in physics education research (PER) involve connecting a research question to observations of student behavior or knowledge in such a way that the information contained in the observations can be used to answer the research question. While there may be some circumstances in which the limitations created by some forms of inconsistency between the data collection, data encoding, and data analysis do not hinder a researcher’s ability to answer the research question at hand, we do not intend to treat those circumstances here. Such circumstances are, by their nature, specific to the research question and present far too many possibilities for us to account for in this paper. We focus on distance measures and correlation coefficients in our discussion, as these form the fundamental first step for several different analysis methods. We assess how these distances or correlations treat their data and what properties they (and any analysis method based on them) are assuming the data to have. With this review paper, and its necessarily limited discussion of data encoding in a small slice of the existing PER literature, we seek to initiate a discussion of encoding processes in order to stimulate the PER community to take this procedural step more seriously and feature it more prominently in publications and presentations.
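To illustrate why the choice of encoding matters for distance measures, the following minimal sketch compares two encodings of the same multiple-choice responses. It is not taken from the paper: the three students, their answers, the function names, and the A=1 through D=4 integer coding are all hypothetical, and the formulas are the standard textbook definitions of the Minkowski distance, the simple matching coefficient, and the Jaccard coefficient.

```python
# Hypothetical responses of three students to three multiple-choice items
# with options A-D. Students 2 and 3 each differ from student 1 on item 1 only.
OPTIONS = "ABCD"
s1 = ["A", "B", "C"]
s2 = ["B", "B", "C"]
s3 = ["D", "B", "C"]


def integer_code(responses):
    """Encode A..D as 1..4. This silently treats nominal data as interval data."""
    return [OPTIONS.index(r) + 1 for r in responses]


def one_hot(responses):
    """Encode each item as four binary indicators, one per option."""
    return [1 if r == opt else 0 for r in responses for opt in OPTIONS]


def minkowski(x, y, p=2):
    """Minkowski distance of order p (p=2 is the Euclidean distance)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1 / p)


def simple_matching(x, y):
    """Fraction of binary positions that agree; shared 0s count as agreement."""
    return sum(a == b for a, b in zip(x, y)) / len(x)


def jaccard(x, y):
    """Shared 1s divided by positions where either vector has a 1; shared 0s are ignored."""
    both = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    if either == 0:
        raise ValueError("Jaccard coefficient is undefined for two all-zero vectors")
    return both / either


# Integer coding: the distance depends on the arbitrary order of the option labels.
d12_int = minkowski(integer_code(s1), integer_code(s2))
d13_int = minkowski(integer_code(s1), integer_code(s3))

# One-hot coding: both pairs differ on exactly one item, and the measures agree on that.
smc12 = simple_matching(one_hot(s1), one_hot(s2))
smc13 = simple_matching(one_hot(s1), one_hot(s3))
jac12 = jaccard(one_hot(s1), one_hot(s2))
jac13 = jaccard(one_hot(s1), one_hot(s3))

print(d12_int, d13_int)
print(smc12, smc13)
print(jac12, jac13)
```

Under the integer coding, student 3 appears three times as far from student 1 as student 2 does, purely because "D" was assigned a larger number than "B". Under the one-hot coding both pairs are equally similar, which is the behavior one would expect for nominal data. The two binary measures still give different values because the simple matching coefficient counts jointly unselected options as agreement while the Jaccard coefficient does not.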

THEORY OF DATA
Mathematical properties of data
Types of data
Asymmetric nominal
Symmetric nominal
Ordinal
Interval
Missing data
MATCHING DATA TYPES
Matching to questions
Multiple choice
Short answer
Ranking tasks
Likert scale
Free response
Matching to analysis methods
Minkowski distances
Simple matching coefficient
Jaccard coefficient
Pearson correlation coefficient
Spearman rank correlation
Kendall’s τ
A REVIEW OF PRIOR RESEARCH
Normalized gain
Factor analysis
Model analysis
Cluster analysis
Epistemological agreement
CONCLUSIONS