Abstract

Long-term unemployment has a significant societal impact and is of particular concern to policymakers with regard to economic growth and public finances. This paper constructs advanced ensemble machine learning models to predict citizens’ risk of becoming long-term unemployed, using data collected from European public authorities for employment services. The proposed model achieves 81.2% accuracy in identifying citizens at high risk of long-term unemployment. This paper also examines how to dissect black-box machine learning models by offering explanations at both the local and global level using SHAP, a state-of-the-art model-agnostic approach, to explain the factors that contribute to long-term unemployment. Lastly, this paper addresses an under-explored question in applying machine learning in the public domain, namely the inherent bias in model predictions. The results show that popular models such as gradient boosted trees may produce unfair predictions against senior age groups and immigrants. Overall, this paper sheds light on the increasing shift among governments toward adopting machine learning models to profile citizens and prioritize employment resources, in order to reduce the detrimental effects of long-term unemployment and improve public welfare.

Highlights

  • Long-term unemployment (LTU), as defined by the OECD, refers to unemployed people of working age who are actively looking for a job but remain unemployed for over 12 months

  • In 2018, the European Union implemented the General Data Protection Regulation (GDPR), which mandates that “the data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or significantly affects him or her.”

  • As shown in the performance summary, advanced machine learning models such as gradient boosted trees (XGBoost) can predict long-term unemployment with 81.2% accuracy, which is 10% better than the baseline logistic regression model that most public employment services (PES) currently adopt

Summary

Introduction

Long-term unemployment (LTU), as defined by the OECD, refers to unemployed people of working age who are actively looking for a job but remain unemployed for over 12 months. [1] Long-term unemployment has detrimental effects on the economy, including lower aggregate demand and GDP, a loss of tax revenue for the government, and the excess cost of unemployment benefits. [2] Long-term unemployed people also tend to earn lower incomes even when they find a new job. There are two additional challenges for governments or public organizations in adopting machine learning-based models to automate the identification process. First, in 2018, the European Union implemented the General Data Protection Regulation (GDPR), which mandates that “the data subject should have the right not to be subject to a decision, which may include a measure, evaluating personal aspects relating to him or her which is based solely on automated processing and which produces legal effects concerning him or her or significantly affects him or her.” [5] Second, the growing use of machine learning models in public organizations has stirred a debate about the bias and fairness embedded in these models. An accountable, transparent, and fair model is therefore critically needed for automated decision making in areas such as tackling long-term unemployment.

