Abstract

In the first part of this paper, we use millions of loan-level servicing records for mortgages originated between 2004 and 2016 to study the performance of predictive models of mortgage default. We find that the logistic regression model -- the traditional workhorse for consumer credit modeling -- as well as machine learning methods can be very inaccurate when used to predict loan performance in out-of-time samples. Importantly, we find that this model failure was not unique to the early-2000s housing boom. In the second part of the paper, we use the Panel Study of Income Dynamics to provide evidence that this model failure can be attributed to intertemporal heterogeneity in the relationship between variables frequently used to predict mortgage performance and the realized post-origination path of variables that have been shown to trigger mortgage default. Our findings imply that model instability is a significant source of risk for lenders, such as financial technology firms (Fintechs), that rely heavily on predictive statistical models and machine learning algorithms for underwriting and account management.

