Abstract

Data science is a discipline that provides principles, methodology, and guidelines for analyzing data to produce tools, value, or insights. Driven by strong workforce demand, many academic institutions have started to offer degrees in data science, many at the graduate level and a few at the undergraduate level. Curricula differ across institutions because of varying levels of faculty expertise and because different disciplines (such as mathematics, computer science, and business) take part in developing the curriculum. The University of Massachusetts Dartmouth began offering degree programs in data science in Fall 2015, at both the undergraduate and the graduate level. Quite a few articles have been published on graduate data science courses, but far fewer on undergraduate ones. Our discussion focuses on undergraduate course structure and function, and specifically on a first course in data science. Our design of this course centers on a concept called the data science life cycle. That is, we view the tasks or steps in the practice of data science as forming a process, consisting of states that indicate how the process comes to life and how different tasks depend on or interact with others until the birth of a data product or a conclusion. Naturally, the different pieces of the data science life cycle then form the individual parts of the course. The details of each piece are filled in with concepts, techniques, or skills that are popular in industry. Consequently, the design of our course is both “principled” and practical. A significant feature of our course philosophy is that, in line with activity theory, the course is based on the use of tools to transform real data to answer strongly motivated questions related to the data.

Highlights

  • We discuss our implementation of a first-year undergraduate course in data science as part of a 4-year university-level BS in data science, and we elaborate on what we see as important principles for any beginning undergraduate course in data science

  • In the beginning, we focus mainly on exploratory data analysis and on concepts related to various parts of the data science life cycle

  • We have briefly introduced a first course in data science offered at the University of Massachusetts Dartmouth since Fall 2015

Summary

Introduction

We discuss our implementation of a first-year undergraduate course in data science as part of a 4-year university-level BS in data science, and we elaborate on what we see as important principles for any beginning undergraduate course in data science. Our principal aim is to stimulate discussion on relevant principles and criteria for a productive introduction to data science.

Background on data science
Asking interesting questions
Details of R programming
Sampling and data collection
Exploratory data analysis
Descriptive and summary statistics
Graphics and data visualization
Data transformation and feature engineering
Clustering
Simple modeling with linear regression
Confirmatory data analysis and hypothesis testing
Difference from a statistics course at similar level
Labs and course projects
Assessment
Conclusion
Findings
10.1 A sample weekly schedule of DSC101
