Abstract

The use of retinal fundus images plays a major role in the diagnosis of diseases such as diabetic retinopathy. Clinicians frequently perform vessel segmentation as a key step in retinal image analysis, but doing so manually is laborious and time-consuming, so researchers have applied deep learning models such as the U-Net to automate the process. However, the U-Net struggles to generalize its predictions across datasets due to variability in fundus images. To overcome this limitation, I propose a cross-domain Vector Quantized Variational Autoencoder (VQ-VAE) that is dataset-agnostic: regardless of the training dataset, the VQ-VAE produces accurate vessel segmentations. The model does not have to be retrained for each new target dataset, eliminating the need for additional data, resources, and time. The VQ-VAE consists of an encoder-decoder network with a custom discrete embedding space; the encoder's output is quantized through this embedding space and then decoded to produce a segmentation mask. Both this VQ-VAE and a U-Net model were trained on the DRIVE dataset and tested on the DRIVE, IOSTAR, and CHASE_DB1 datasets. Both models were successful on the dataset they were trained on, DRIVE. However, the U-Net failed to generate vessel segmentation masks when tested on the other datasets, while the VQ-VAE performed with high accuracy. Quantitatively, the VQ-VAE performed well, achieving F1 scores from 0.758 to 0.767 across datasets. My model can produce convincing segmentation masks for new retinal image datasets without additional data, time, or resources. Applications include running the VQ-VAE immediately after a fundus image is taken to streamline the vessel segmentation process.
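To make the quantization step described above concrete, here is a minimal sketch in Python/NumPy. It is an illustration under stated assumptions, not the paper's implementation: the codebook size K, embedding dimension D, and latent grid size are hypothetical placeholders, and the codebook here is random rather than learned.

```python
# Minimal sketch of a VQ-VAE's vector-quantization step.
# Illustrative only: K, D, and the latent grid size are assumed,
# not values reported in the paper.
import numpy as np

rng = np.random.default_rng(0)

K, D = 512, 64                       # codebook entries, embedding dimension (assumed)
codebook = rng.normal(size=(K, D))   # stands in for the learned discrete embedding space

# Stand-in for the encoder's output: a 32x32 grid of D-dimensional latents.
z_e = rng.normal(size=(32 * 32, D))

# Squared Euclidean distance from every latent vector to every codebook
# entry, expanded as ||z||^2 - 2*z.c + ||c||^2 to avoid a large 3-D tensor.
dists = (
    (z_e ** 2).sum(axis=1, keepdims=True)
    - 2.0 * z_e @ codebook.T
    + (codebook ** 2).sum(axis=1)
)

codes = dists.argmin(axis=1)   # discrete code index for each latent vector
z_q = codebook[codes]          # quantized latents, passed on to the decoder

print(codes.shape, z_q.shape)  # (1024,) (1024, 64)
```

In a full VQ-VAE, the decoder would map z_q back to a full-resolution segmentation mask, and the codebook would be learned jointly with the encoder and decoder during training.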
