BalancerGNN: Balancer Graph Neural Networks for imbalanced datasets: A case study on fraud detection

Mallika Boyapati, Ramazan Aygun

doi:10.1016/j.neunet.2024.106926

Abstract

Fraud detection on imbalanced datasets is challenging because machine learning models tend to learn the majority class. Imbalance in fraud detection datasets also affects how graphs are built, an important step in many Graph Neural Networks (GNNs). In this paper, we introduce our BalancerGNN framework to tackle imbalanced datasets and show its effectiveness on fraud detection. Our framework has three major components: (i) node construction with feature representations, (ii) graph construction using balanced neighbor sampling, and (iii) GNN training using balanced training batches and a custom loss function with multiple components. For node construction, we introduce (i) Graph-based Variable Clustering (GVC), which optimizes feature selection and removes redundancies by analyzing multi-collinearity, and (ii) Encoder-Decoder based Dimensionality Reduction (EDDR), which uses transformer-based techniques to reduce feature dimensions while preserving the important information in textual embeddings. Our experiments on Medicare, Equifax, IEEE, and auto insurance fraud datasets highlight the importance of node construction with feature representations. BalancerGNN trained with balanced batches consistently outperforms other methods at identifying fraud cases, with sensitivity ranging from 72.87% to 81.23% across datasets while balancing specificity. Additionally, BalancerGNN achieves accuracy ranging from 73.99% to 94.28%. These outcomes underscore the crucial role of graph representation and neighbor sampling techniques in optimizing BalancerGNN for fraud detection in real-world applications.
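The abstract does not specify how the balanced training batches are formed. As a minimal sketch of the general idea only, and not the authors' implementation, the following assumes binary labels with 0 as the majority (non-fraud) class and 1 as the minority (fraud) class, and builds batches of node indices containing equal numbers of each. The function name `balanced_batches` and the choice to oversample the minority class with replacement are illustrative assumptions.

```python
import random
from typing import List, Sequence


def balanced_batches(labels: Sequence[int], batch_size: int, seed: int = 0) -> List[List[int]]:
    """Return batches of node indices with equal counts of each class.

    Hypothetical helper: assumes label 0 is the majority class and label 1
    the minority class. Minority indices are drawn with replacement, so the
    same fraud node may appear in several batches.
    """
    if batch_size < 2 or batch_size % 2 != 0:
        raise ValueError("batch_size must be a positive even number")

    majority = [i for i, y in enumerate(labels) if y == 0]
    minority = [i for i, y in enumerate(labels) if y == 1]
    if not majority or not minority:
        raise ValueError("both classes must be present")

    rng = random.Random(seed)
    rng.shuffle(majority)

    half = batch_size // 2
    batches = []
    # Walk through the shuffled majority indices in chunks of `half`;
    # a trailing chunk smaller than `half` is dropped.
    for start in range(0, len(majority) - half + 1, half):
        negatives = majority[start:start + half]
        positives = rng.choices(minority, k=half)  # oversample with replacement
        batch = negatives + positives
        rng.shuffle(batch)
        batches.append(batch)
    return batches


# Toy data: 95 non-fraud nodes and 5 fraud nodes.
labels = [0] * 95 + [1] * 5
batches = balanced_batches(labels, batch_size=10)
```

Each batch here is half fraud and half non-fraud regardless of the overall class ratio, so a model trained on these batches sees the minority class as often as the majority class. The balanced neighbor sampling and the custom loss described in the abstract are separate components that this sketch does not cover.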

Similar Papers

SCN_GNN: A GNN-based fraud detection algorithm combining strong node and graph topology information
Jing Chen ... Yuxuan Wang
Expert Systems with Applications | VOL. 237
22 Sep 2023

Learning Graph Neural Networks with Deep Graph Library
Da Zheng ... Zheng Zhang
20 Apr 2020

Medicare fraud detection using graph neural networks
Yeeun Yoo ... Donghwa Shin
20 Jul 2022

Evaluating Model Predictive Performance: A Medicare Fraud Detection Case Study
Richard A Bauder ... Matthew Herland
01 Jul 2019
