As vast amounts of data have become available to businesses in recent years, the demand for data scientists has risen. This article provides a tutorial on how an entry-level machine learning competition from Kaggle, an online community for data scientists, can be integrated into an undergraduate econometrics course as an engaging activity using only linear regression. The tutorial also covers log-linear and quadratic models and interactions of explanatory variables, which are common functional forms in econometrics. The competition allows students to work with real-world data, build a predictive model, submit their model online to be evaluated instantaneously for accuracy, and keep improving their model. R and Python code is provided so that readers can replicate the exercise.
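To give a flavor of the functional forms named above, the following is a minimal sketch, not the article's own code. It fits a log-linear regression with a quadratic term and an interaction by ordinary least squares. The data are synthetic, and the variable names (`area`, `quality`, `log_price`), the coefficient values, and the use of root mean squared error on the log scale as the accuracy measure are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic, hypothetical data: NOT taken from the competition.
area = rng.uniform(50, 300, n)                    # e.g. living area
quality = rng.integers(1, 11, n).astype(float)    # e.g. a 1-10 rating
log_price = (
    10.0
    + 0.01 * area
    - 0.00001 * area**2            # quadratic term
    + 0.05 * quality
    + 0.0002 * area * quality      # interaction term
    + rng.normal(0, 0.1, n)        # noise
)

def design_matrix(area, quality):
    """Columns: intercept, area, area^2, quality, area*quality."""
    return np.column_stack(
        [np.ones_like(area), area, area**2, quality, area * quality]
    )

# Hold out the last 100 observations to mimic an unseen test set.
train, test = slice(0, 400), slice(400, None)

# Log-linear model: regress log(price) on the regressors by OLS.
X_train = design_matrix(area[train], quality[train])
beta, *_ = np.linalg.lstsq(X_train, log_price[train], rcond=None)

# Predict on the held-out data and score on the log scale.
X_test = design_matrix(area[test], quality[test])
pred_log = X_test @ beta
rmse_log = np.sqrt(np.mean((pred_log - log_price[test]) ** 2))

# Back-transform to the price level (ignores any retransformation correction).
pred_price = np.exp(pred_log)

print("coefficients:", beta)
print("hold-out RMSE (log scale):", rmse_log)
```

Because the dependent variable is in logs, each slope is read approximately as a proportional change in price, and the quadratic and interaction terms let the effect of `area` vary with its own level and with `quality`.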