Abstract
This study addresses the economic significance of unpaid taxes by proposing an automatic system for predicting tax defaults. Tax default prediction has received little attention to date. Moreover, existing approaches tend to apply conventional statistical methods rather than advanced data analytic approaches, including state-of-the-art machine learning methods. As a result, existing studies cannot effectively detect tax default information in real-world financial data because they fail to take into account appropriate data transformations and nonlinear relationships between early-warning financial indicators and tax default behavior. To overcome these problems, this study applies diverse feature transformation techniques and state-of-the-art machine learning approaches. The proposed prediction system is validated on a dataset of tax-defaulting and non-defaulting Finnish limited liability firms. Our findings provide evidence that feature transformation, such as logarithmic and square-root transformation, plays a major role in improving the performance of tax default prediction. We also show that extreme gradient boosting and the systematically developed forest of multiple decision trees outperform other machine learning methods in terms of accuracy and other classification performance measures. Furthermore, the equity ratio, liquidity ratio, and debt-to-sales ratio are the most important indicators of tax defaults for 1-year-ahead predictions. This study therefore highlights the essential role of well-designed tax default prediction systems, which require a combination of feature transformation and machine learning methods. The effective implementation of an automatic tax default prediction system has important implications for tax administration and can assist administrators in achieving feasible government expenditure allocations and revenue expansions.
Highlights
According to World Bank statistics, approximately 40% of firms worldwide pay their taxes, whereas 60% do not, and the unpaid amounts may not be recovered in subsequent tax years
Nonparametric statistical tests were conducted using the Knowledge Extraction based on Evolutionary Learning (KEEL) GPLv3 modules
For the machine learning methods, a grid search with 10-fold cross-validation was used to find the optimal values of the training parameters
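The grid search over 10-fold cross-validation mentioned in the highlights can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the synthetic "equity ratio" data, the toy one-feature nearest-neighbour classifier, and the candidate grid are not the study's data, models, or parameter ranges.

```python
import random

def make_folds(n_samples, n_folds=10, seed=0):
    """Shuffle the sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train_x, train_y, x, n_neighbors):
    """Toy 1-D k-nearest-neighbour classifier (stand-in for a real model)."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    votes = sum(train_y[i] for i in nearest[:n_neighbors])
    return 1 if 2 * votes > n_neighbors else 0

def cv_accuracy(xs, ys, n_neighbors, folds):
    """Mean held-out accuracy: train on all folds but one, test on the rest."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        hits = sum(
            knn_predict(train_x, train_y, xs[i], n_neighbors) == ys[i]
            for i in held_out
        )
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)

def grid_search(xs, ys, grid, n_folds=10):
    """Evaluate every candidate on the same folds and keep the best one."""
    folds = make_folds(len(xs), n_folds)
    results = {k: cv_accuracy(xs, ys, k, folds) for k in grid}
    best = max(results, key=results.get)
    return best, results

# Synthetic data (assumption): defaulters (label 1) tend to have a lower ratio.
rng = random.Random(42)
xs = [rng.gauss(0.1, 0.15) for _ in range(100)] + \
     [rng.gauss(0.5, 0.15) for _ in range(100)]
ys = [1] * 100 + [0] * 100

best_k, cv_scores = grid_search(xs, ys, grid=[1, 3, 5, 7])
```

Evaluating every candidate on the same fold assignment keeps the comparison fair, since differences in the mean score then reflect the parameter value rather than a different split.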
Summary
According to World Bank statistics, approximately 40% of firms worldwide pay their taxes, whereas 60% do not, and the unpaid amounts may not be recovered in subsequent tax years. In the literature on credit default and corporate bankruptcy prediction, feature transformation techniques have been applied to enhance the informational content of financial indicators by reducing their group-level heterogeneity [13] and the distortions of financial ratio distributions [14]. Based on these considerations, the current study examines several approaches to feature transformation, which is a novel research domain in the taxation and accounting fields. A real-world dataset of tax-defaulting and non-defaulting Finnish firms is used to demonstrate the effectiveness of the proposed prediction system, indicating significant improvements in classification performance over existing tax default prediction models.
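As a rough illustration of how logarithmic and square-root transformations can reduce the distortion of a financial ratio's distribution, the sketch below compares skewness before and after transformation. The sample values are made up, and the sign-preserving variants are an assumption introduced here so that negative ratios stay defined; the source does not state how the study handles negative values.

```python
import math

def signed_log(x):
    """Sign-preserving log transform: sign(x) * log(1 + |x|) (assumed variant)."""
    return math.copysign(math.log1p(abs(x)), x)

def signed_sqrt(x):
    """Sign-preserving square-root transform: sign(x) * sqrt(|x|) (assumed variant)."""
    return math.copysign(math.sqrt(abs(x)), x)

def skewness(values):
    """Population skewness: third central moment over variance^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical debt-to-sales ratios with a long right tail (not study data).
debt_to_sales = [0.1, 0.2, 0.3, 0.4, 0.5, 0.8, 1.2, 2.5, 6.0, 15.0]

log_ratios = [signed_log(v) for v in debt_to_sales]
sqrt_ratios = [signed_sqrt(v) for v in debt_to_sales]

raw_skew = skewness(debt_to_sales)
log_skew = skewness(log_ratios)
sqrt_skew = skewness(sqrt_ratios)
```

On this toy sample, the log-transformed ratios are less skewed than the raw ones, which is the kind of distributional effect the transformations are meant to achieve before the features are passed to a classifier.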