Abstract
Machine learning has dramatically expanded the range of tools for evaluating economic panel data. This paper applies a variety of machine-learning methods to the Boston housing dataset, an iconic proving ground for machine learning. Though machine learning often lacks the overt interpretability of linear regression, methods based on decision trees score the relative importance of dataset features. In addition to addressing the theoretical tradeoff between bias and variance, this paper discusses practices rarely followed in traditional economics: the splitting of data into training, validation, and test sets; the scaling of data; and the preference for retaining all data. The choice between traditional and machine-learning methods hinges on practical rather than mathematical considerations. In settings emphasizing interpretative clarity through the scale and sign of regression coefficients, machine learning may best play an ancillary role. Wherever predictive accuracy is paramount, however, or where heteroskedasticity or high dimensionality might impair the clarity of linear methods, machine learning can deliver superior results.
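Two of the practices named above, splitting data into training, validation, and test sets and scaling the data, can be illustrated with a minimal sketch. It is not the paper's own code. The data are synthetic stand-ins rather than the Boston housing dataset, and the 60/20/20 proportions, the function names (`split_indices`, `fit_scaler`, `transform`), and the choice of standardization as the scaling method are all assumptions made for illustration.

```python
import random
import statistics


def split_indices(n, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle row indices and cut them into train/validation/test parts.

    The 60/20/20 default is an illustrative assumption, not a value
    taken from the paper.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_frac)
    n_val = round(n * val_frac)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def fit_scaler(rows):
    """Estimate per-column mean and standard deviation from the given rows."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return means, stds


def transform(rows, means, stds):
    """Standardize each column using previously estimated statistics."""
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Synthetic two-feature data on very different scales (hypothetical values).
rng = random.Random(42)
X = [[rng.uniform(0, 100), rng.uniform(0, 1)] for _ in range(100)]

train_idx, val_idx, test_idx = split_indices(len(X))

# Scaling statistics come from the training rows only.
means, stds = fit_scaler([X[i] for i in train_idx])

X_train = transform([X[i] for i in train_idx], means, stds)
X_val = transform([X[i] for i in val_idx], means, stds)
X_test = transform([X[i] for i in test_idx], means, stds)
```

In this sketch the validation and test rows are scaled with statistics estimated on the training rows alone, so no information from the held-out data reaches the fitting step.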