Abstract

• We propose an FL model that predicts a client's (or loan requester's) financial situation by considering variant local epochs for the clients' data holders (e.g., banks, financial organizations).
• We leverage an FL strategy that considers each client's local resources when assigning computational tasks during training. In particular, the local computational task of each FL client is assigned based on its data volume, bandwidth, and network availability.
• We analyze our prediction model under various batch sizes and client counts during the training phase.
• Finally, we visualize the performance of our FL model compared with a centralized model, as well as with the mean local model and the best local model in an FL process.

In recent years, as economic stability has been shaken and the unemployment rate has grown due to the COVID-19 pandemic, assigning credit scores by predicting consumers' financial conditions has become more crucial. Conventional machine learning (ML) and deep learning approaches require sharing customers' sensitive information with an external credit bureau to generate a prediction model, which opens the door to privacy leakage. A recently introduced privacy-preserving distributed ML scheme, referred to as federated learning (FL), enables generating a target model without sharing local information, through on-device model training on edge resources. In this paper, we propose an FL-based application to predict customers' financial issues by constructing a global learning model that evolves from the local models of distributed agents. The local models are generated by the network agents using their on-device data and local resources. We adopt the FL concept because the learning strategy does not require sharing any data with the server or with any other agent, which ensures the preservation of customers' sensitive data.
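The resource-aware task assignment and global-model construction described above can be sketched as follows. This is a minimal illustration, not the paper's exact method: the capacity score (product of data volume, bandwidth, and availability) and the proportional epoch rule are assumed heuristics, and the aggregation shown is standard sample-weighted federated averaging.

```python
import numpy as np

def assign_local_epochs(clients, max_epochs=10):
    """Assign each client a local-epoch budget proportional to a simple
    capacity score built from data volume, bandwidth, and availability.
    (Illustrative heuristic; the paper's exact assignment rule may differ.)"""
    scores = np.array([c["data_volume"] * c["bandwidth"] * c["availability"]
                       for c in clients], dtype=float)
    scores /= scores.max()
    # Every client trains at least one local epoch.
    return np.maximum(1, np.round(scores * max_epochs)).astype(int)

def fedavg(local_weights, sample_counts):
    """Standard FedAvg: average per-layer local weights, each client
    weighted by its number of training samples."""
    total = sum(sample_counts)
    return [sum(n / total * w[k] for w, n in zip(local_weights, sample_counts))
            for k in range(len(local_weights[0]))]
```

A well-provisioned bank would thus receive the full epoch budget, while a client with low bandwidth or intermittent availability is assigned proportionally fewer local epochs.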
To that end, we enable partial work from weak agents, which eliminates the retarded model convergence caused by straggler agents. We also leverage asynchronous FL, which cuts off the extra waiting time during global model generation. We simulated the performance of our FL model on a popular dataset, Give Me Some Credit (Freshcorn, 2017). We evaluated our proposed method with different numbers of stragglers and various computational-task settings (e.g., local epochs, batch size), and simulated the training loss and testing accuracy of the prediction model. Finally, we compared the F1-score of our proposed model with existing centralized and decentralized approaches. Our results show that our proposed model achieves an almost identical F1-score to the centralized model even at a skew level of more than 80%, and outperforms state-of-the-art FL models by an average of 5 ∼ 6% higher accuracy when resource-constrained agents are present in the learning environment.
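The asynchronous server step that removes the waiting time can be sketched as below. This is an assumed illustration of a common asynchronous-FL pattern (mixing each arriving client model into the global model with a staleness-discounted rate), not the paper's exact update rule; `base_lr` and the `1/(1+staleness)` decay are placeholder choices.

```python
import numpy as np

def async_update(global_w, client_w, staleness, base_lr=0.5):
    """Asynchronous FL server step: blend an arriving (possibly partial)
    client model into the global model without waiting for other agents.
    Stale updates are discounted so late stragglers perturb the model less.
    (Illustrative rule; the paper's exact mixing weights may differ.)"""
    alpha = base_lr / (1.0 + staleness)
    return [(1 - alpha) * gw + alpha * cw for gw, cw in zip(global_w, client_w)]
```

Because each client update is absorbed as soon as it arrives, a straggler that completes only part of its assigned local epochs still contributes, just with a smaller mixing weight.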
