Abstract

This paper proposes a decision support system that predicts corporate tax arrears from tax arrears in the preceding 12 months. Despite the economic importance of ensuring tax compliance, studies on predicting corporate tax arrears remain scarce and have reported only modest accuracy. Four machine learning methods (decision tree, random forest, k-nearest neighbors and multilayer perceptron) were used to build models from monthly tax arrears and different variables constructed from them. The data consisted of the tax arrears of all Estonian SMEs from 2011 to 2018, totaling over two million firm-month observations. The best-performing decision support system, yielding 95.3% accuracy, was a hybrid that applies the random forest method to observations with previous tax arrears in at least two months and a logical rule to the rest of the observations.
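The routing logic of such a hybrid can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `hybrid_predict`, `fallback_rule` and `model_predict` are hypothetical, and because the abstract does not state the logical rule, the sketch assumes one (predict arrears only if the most recent month had arrears). Any fitted classifier, such as a random forest, could be passed in as `model_predict`.

```python
from typing import Callable, Sequence


def count_arrears_months(history: Sequence[float]) -> int:
    """Count the months in the window with a positive tax arrears amount."""
    return sum(1 for amount in history if amount > 0)


def fallback_rule(history: Sequence[float]) -> int:
    """ASSUMED rule (not given in the abstract): predict arrears (1)
    only if the most recent month had arrears, otherwise 0."""
    return int(history[-1] > 0)


def hybrid_predict(
    history: Sequence[float],
    model_predict: Callable[[Sequence[float]], int],
    min_arrears_months: int = 2,
) -> int:
    """Route one 12-month history to the model or to the logical rule.

    Histories with arrears in at least `min_arrears_months` months go to
    the trained model; all other histories go to the fallback rule.
    """
    if len(history) != 12:
        raise ValueError("expected 12 monthly tax arrears values")
    if count_arrears_months(history) >= min_arrears_months:
        return int(model_predict(history))
    return fallback_rule(history)
```

Here `model_predict` stands in for the prediction call of a trained model on a single observation. Keeping the routing separate from the model makes it possible to change the threshold or the rule without retraining.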

Highlights

  • Learning from mostly zero-valued data is a difficult task for machine learning models

  • The best-performing machine learning model was a random forest trained on monthly tax arrears with earlier periods aggregated into period means (M5_RF), with a prediction accuracy of 84.46%

  • The aim of this paper was to explore which machine learning methods and types of independent variables are most useful for predicting whether a company will have tax arrears in a given month, based on the time series of its tax arrears in the preceding 12 months
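The idea of aggregating earlier periods into period means can be illustrated with a short sketch. The split used here is an assumption, since the highlights do not specify it: the three most recent months are kept as individual values and the nine earlier months are averaged in three-month periods. The function name `aggregate_features` and its parameters are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def aggregate_features(
    history: Sequence[float],
    recent_months: int = 3,   # ASSUMED: months kept as individual values
    period_length: int = 3,   # ASSUMED: length of each averaged period
) -> List[float]:
    """Build a feature vector from a monthly tax arrears history.

    Earlier months are replaced by per-period means (oldest first),
    followed by the most recent months as raw values.
    """
    if not 0 <= recent_months <= len(history):
        raise ValueError("recent_months must be between 0 and len(history)")
    if period_length <= 0:
        raise ValueError("period_length must be positive")

    split = len(history) - recent_months
    earlier, recent = history[:split], history[split:]
    if len(earlier) % period_length != 0:
        raise ValueError("earlier months must divide evenly into periods")

    period_means = [
        float(mean(earlier[i:i + period_length]))
        for i in range(0, len(earlier), period_length)
    ]
    return period_means + [float(x) for x in recent]
```

With these defaults, a 12-month history is reduced to six features: three period means followed by three monthly values. This shortens the mostly zero-valued input while keeping the latest months at full resolution.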


Summary

Introduction

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

