Tax evasion is an illegal activity in which individuals or entities avoid paying their true tax liability. It has always been a crucial issue for both governments and academic researchers to efficiently detect tax evasion. Recent research has proposed the use of machine learning technology to detect tax evasion and has shown good results in some specific areas. Regrettably, there are still two major obstacles to detect tax evasion. First, it is hard to extract powerful features because of the complexity of tax data. Second, due to the complicated process of tax auditing, labeled data are limited. Such obstacles motivate the contributions of this work. In this paper, we propose a novel tax evasion detection framework named FBNE-PU, a multi-stage method to detect tax evasion in real-life scenarios. In this paper, we perform an in-depth analysis of the characteristics of the transaction network and propose a novel network embedding algorithm, the PnCGCN. It significantly improves detection performance by extracting powerful features from basic features and the tax-related transaction network. Moreover, we utilize nnPU to assign pseudo labels for unlabeled data. Finally, a MLP is trained as the decision function. Experiments on three real-world datasets demonstrate that our method significantly outperforms the comparison methods in the tax evasion detection task.
Read full abstract