Abstract

We discuss the design, effectiveness, and curricular impacts of an introductory data science course that uniquely meets a number of competing constraints. Student needs range from an initial exploration for future data scientists to the only computing course for some science students. Initially, associated research labs, departments, and industries provided differing perspectives on skills and preferred programming languages. The focus was on practical programming skills and learning through demos to develop useful skills quickly. The task-driven nature also enabled the parallel development of resources for each of the top three requested languages (Python, R, and MATLAB), allowing students to complete the course concurrently in any of the three. Tutorials in all three languages were produced to prepare students for the language-independent, task-driven assignments given each day. For consistency, all tutorials and assignments use the Jupyter notebook cell-based workflow. Feedback and performance measures of three student cohorts (Spring 2018, Fall 2018, and Spring 2019) are analyzed. In each cohort, the average student rating met or exceeded 4 (substantial progress) out of 5 on all four general educational objectives (Basic Understanding, Developing Skills, Application, and Numeric Methods). 75% of responding students indicated a preference for the multi-language course design. The course also attracted a majority of women every semester (63% female on average). Because of these benefits, this course became the recommended first programming course for a newly developed and approved undergraduate data science major. All material is online and available at https://sites.google.com/view/comp-180

Highlights

  • The amount of data generated, processed, and analyzed is increasing at an exceptional rate

  • Training scientists for our data-driven world has inherent overlap with a field of explosive growth in industry. Data science is valued across academia and industry due to its ability to extract meaningful insights from data using a combination of domain knowledge, programming skills, and statistics [13]

  • During curricular development of a new data science undergraduate major, this course was chosen as the introductory programming course for students new to the major as it provided a beneficial foundation for future work


Summary

INTRODUCTION

The amount of data generated, processed, and analyzed is increasing at an exceptional rate. We had competing interests: psychology and engineering stakeholders suggested MATLAB, statisticians suggested R, and Python is the most popular language overall, especially in computer science [19]. We addressed this by designing course materials that provide code-based instruction in all three languages, while the exercises and evaluations are language agnostic. During curricular development of a new data science undergraduate major, this course was chosen as the introductory programming course for students new to the major, as it provided a beneficial foundation for future work. This approach to course development, and the data science program in which it was embedded, are further supported by observations in similar data science courses and programs. We demonstrate that this data-centric approach better engages and supports a diversity of students [9].

COURSE CONTENT
CLASSROOM APPROACH
Assignments and Projects
Assessment Strategy
COURSE EXPERIENCE AND FEEDBACK
Student Composition
Student Feedback
Student Performance
Findings
CONCLUSION