Abstract
A version control system records changes to a file or set of files over time so that changes can be tracked and specific versions of a file can be recalled later. As such, it is an essential element of a reproducible workflow that deserves due consideration among the learning objectives of statistics courses. This article describes experiences and implementation decisions of four contributing faculty who are teaching different courses at a variety of institutions. Each of these faculty has set version control as a learning objective and successfully integrated one such system (Git) into one or more statistics courses. The various approaches described in the article span different implementation strategies to suit student background, course type, software choices, and assessment practices. By presenting a wide range of approaches to teaching Git, the article aims to serve as a resource for statistics and data science instructors teaching courses at any level within an undergraduate or graduate curriculum.
Highlights
Nolan & Temple Lang (2010) promote “version control” as a key topic for statistical analysis, when coordinating work across a team
Version control is an important foundation for reproducible workflows, be they collaborative or non-collaborative
Version control forms a necessary part of a reproducible workflow and deserves due consideration among the learning objectives of statistics and data science courses
Summary
Nolan & Temple Lang (2010) promote “version control” as a key topic for statistical analysis, when coordinating work across a team. Version control is an important foundation for reproducible workflows, be they collaborative (maintaining versions of files that are being modified by teams) or non-collaborative (tracking analysis histories and providing analysis provenance). It forms a necessary part of a reproducible workflow and deserves due consideration among the learning objectives of statistics and data science courses. We begin by discussing our motivations for identifying version control as a learning objective, then summarize the courses taught by the four contributing faculty, highlighting the different implementation strategies chosen based on student audience, course type, software choices, and assessment practices.
Software mentioned:
RStudio: an Integrated Development Environment (IDE), i.e., a front-end, for R that offers integration with Git. (rstudio.com)
RStudio Server Pro: a server-based version of RStudio that can be installed for free for academic use by instructors or institutions. (rstudio.com/products/rstudio-server-pro)
RStudio Cloud: a cloud-based version of RStudio software on servers provisioned by RStudio. (rstudio.cloud)