Using Machine Learning to Catch Bogus Firms

Taha Barwahwala, Shekhar Mittal, Aprajit Mahajan, Ofir Reich

doi:10.1145/3676188

Abstract

We investigate the use of a machine learning (ML) algorithm to identify fraudulent, non-existent firms that are used for tax evasion. Using a rich dataset of tax returns filed in an Indian state over several years, we train an ML-based model to predict fraudulent firms. We then use the model's predictions to carry out field inspections of firms that the ML tool flags as suspicious. We find that the ML model identifies non-existent firms accurately in both simulated and field settings. By withholding a randomly selected group of firms from inspection, we estimate the causal impact of ML-driven inspections. Despite the strong predictive performance, our model-driven inspections do not yield a significant increase in enforcement, as measured by the cancellation of fraudulent firm registrations and by tax recovery. We provide two explanations for this discrepancy, based on a close analysis of the tax department’s operating protocols: overfitting to proxy labels, and institutional friction in integrating the model into existing administrative systems. Our study serves as a cautionary tale for the application of machine learning in public policy contexts and against relying solely on test-set performance as an indicator of effectiveness. Field evaluations are critical in assessing the real-world impact of predictive models.
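As a rough illustration of the evaluation design described in the abstract, the sketch below randomly withholds part of a set of flagged firms from inspection and compares an enforcement outcome between the two groups. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the holdout fraction, the firm IDs, and the toy outcomes are all hypothetical, and a plain difference in means stands in for whatever estimator the paper actually uses.

```python
import random
from statistics import mean


def assign_holdout(firm_ids, holdout_fraction=0.5, seed=0):
    """Randomly split flagged firms into an inspection group and a holdout group.

    Hypothetical helper: the split fraction and seed are illustrative only.
    """
    rng = random.Random(seed)
    shuffled = list(firm_ids)
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    holdout = set(shuffled[:n_holdout])
    inspect = set(shuffled[n_holdout:])
    return inspect, holdout


def difference_in_means(outcomes, inspect, holdout):
    """Estimate the effect of inspection as mean(inspected) - mean(withheld).

    `outcomes` maps firm ID to an enforcement outcome, for example
    1 if the registration was cancelled and 0 otherwise (assumed encoding).
    """
    treated = mean(outcomes[f] for f in inspect)
    control = mean(outcomes[f] for f in holdout)
    return treated - control


if __name__ == "__main__":
    # Toy, made-up data: 200 flagged firms with random binary outcomes.
    flagged = list(range(200))
    inspect, holdout = assign_holdout(flagged, holdout_fraction=0.5, seed=42)
    rng = random.Random(7)
    outcomes = {f: rng.randint(0, 1) for f in flagged}
    print("estimated effect:", difference_in_means(outcomes, inspect, holdout))
```

Because assignment to the holdout group is random, a difference between the two groups can be attributed to the inspections themselves rather than to how the firms were selected. This is the sense in which strong test-set accuracy and the measured field impact can diverge.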

ACM Journal on Computing and Sustainable Societies, 01 Jul 2024