Optimizing tax revenue in Indonesia is difficult because of obstacles such as tax evasion and tax avoidance. Both are closely related to an organization's compliance with tax regulations, which is captured in the taxpayer's risk profile. However, this mechanism does not accurately detect tax avoidance and tax evasion risks. To overcome this limitation, this study uses multilabel classification, a machine learning approach that assigns one or more labels to a single observation at once. We compare methods from three families: problem transformation (binary relevance and label powerset), algorithm adaptation (multilabel k-nearest neighbor (ML-kNN) and multilabel adaptive resonance associative map (ML-ARAM)), and ensembles (label space partitioning and random k-labelsets with disjoint label subsets (RAkELd)). Based on the model performance comparison, the ML-ARAM method, which is based on deep learning, performs best, with an average F1-score of 95.5% and a Hamming loss of 7.4%. We also examine the feature importance of the best model to reduce the feature dimensions and identify the dominant factors that encourage a taxpayer entity to engage in tax avoidance or tax evasion. The findings improve the accuracy of tax avoidance and tax evasion risk detection using machine learning methods, which supports efforts to maximize tax revenue in Indonesia.
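To make the terminology concrete, the following is a minimal sketch, not the study's implementation, of the two problem transformation strategies and the two reported metrics on a toy label matrix. The label names, the toy values, and the helper function names are hypothetical. The abstract does not say how the F1-score is averaged, so the macro average used here is an assumption.

```python
# Toy label matrix: each row is one taxpayer, each column one hypothetical
# label (tax_avoidance_risk, tax_evasion_risk). Values are illustrative only.
Y_true = [[1, 0], [1, 1], [0, 0], [0, 1]]
Y_pred = [[1, 0], [1, 0], [0, 0], [1, 1]]


def binary_relevance_targets(Y):
    """Split an n x L label matrix into L independent binary target vectors."""
    return [list(col) for col in zip(*Y)]


def label_powerset_targets(Y):
    """Map each distinct label combination to a single multiclass id."""
    classes = {}
    ids = [classes.setdefault(tuple(row), len(classes)) for row in Y]
    return ids, classes


def hamming_loss(Y_true, Y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(t != p for rt, rp in zip(Y_true, Y_pred) for t, p in zip(rt, rp))
    return wrong / (len(Y_true) * len(Y_true[0]))


def macro_f1(Y_true, Y_pred):
    """Unweighted mean of per-label F1 scores (averaging choice is assumed)."""
    scores = []
    for t_col, p_col in zip(zip(*Y_true), zip(*Y_pred)):
        tp = sum(t == 1 and p == 1 for t, p in zip(t_col, p_col))
        fp = sum(t == 0 and p == 1 for t, p in zip(t_col, p_col))
        fn = sum(t == 1 and p == 0 for t, p in zip(t_col, p_col))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print(binary_relevance_targets(Y_true))   # one binary problem per label
    print(label_powerset_targets(Y_true)[0])  # one multiclass problem overall
    print(hamming_loss(Y_true, Y_pred))
    print(macro_f1(Y_true, Y_pred))
```

Binary relevance trains one binary classifier per column, whereas label powerset trains a single multiclass classifier over label combinations. A lower Hamming loss and a higher F1-score both indicate better predictions.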