Abstract

Providing a rich context has become a sine qua non of principled teaching of applied statistical thinking. With increasing opportunities to access secondary data, there should be increasing opportunity to work with rich context. We review the contextual information provided in 41 data sets suitable for introductory tertiary statistics teaching, available in the R “datasets” package, and investigate the source information for four data sets. We find failure to describe and retain important contextual information, including aspects that raise questions about the credibility of the data for statistical inference. The sanitization of data reduces the opportunities for learning meaningful lessons in statistical thinking and the real‐world application of statistics. We advocate for teachers and users of such data to be curious about the provenance and context, and for the curators and distributors to examine, where possible, the primary sources, to accurately preserve the context and optimize pedagogical opportunities.
