The use of generative adversarial networks for multi-site one-class follicular lymphoma classification

Upeka Vianthi Somaratne, Kok Wai Wong, Jeremy Parry, Hamid Laga

doi:10.1007/s00521-023-08810-8

Open Access

https://doi.org/10.1007/s00521-023-08810-8

Abstract

Recent advances in digital technologies have lowered the costs and improved the quality of digital pathology Whole Slide Images (WSIs), opening the door to applying Machine Learning (ML) techniques to assist in cancer diagnosis. ML, including Deep Learning (DL), has produced impressive results in diverse image classification tasks in pathology, such as predicting clinical outcomes in lung cancer and inferring regional gene expression signatures. Despite these promising results, the uptake of ML as a common diagnostic tool in pathology remains limited. A major obstacle is the lack of sufficient labelled data for training neural networks and other classifiers, especially for new sites where models have not yet been established. Recently, image synthesis from small, labelled datasets using Generative Adversarial Networks (GANs) has been used successfully to create high-performing classification models. Considering the domain shift and the complexity of annotating data, we investigated a GAN-based approach that minimizes the differences in WSIs between large public data archive sites and the much smaller data archives at new sites. The proposed approach improves the tuning of a deep learning classification model for the class of interest using the small training set available at the new site. This paper combines a GAN with the one-class classification concept to model the data of the class of interest. This approach minimizes the need for large amounts of labelled data from the new site to train the network. The GAN generates synthetic one-class WSIs, which are used together with the WSIs available from the new site to jointly train the classifier. We tested the proposed approach on follicular lymphoma data from a new site, utilizing the data archives from different sites. The synthetic one-class images, generated from data obtained from different sites together with a minimal amount of data from the new site, resulted in a significant improvement of 15% in the Area Under the Curve (AUC) at the new site for which we want to establish a follicular lymphoma classifier. The test results show that, by using the GAN to generate synthetic data from all existing data in the archives of all sites, the classifier can perform well without the need to obtain more training data from the test site.

Full Text