Accurate iris segmentation is a crucial preprocessing stage for computer-aided ophthalmic disease diagnosis. The quality of iris images captured by different camera sensors varies greatly, so accurate segmentation of heterogeneous iris databases is a major challenge. Network architectures based on convolutional neural networks (CNNs) are now widely applied to iris segmentation. However, because the kernel size of convolution layers is limited, CNN-based iris segmentation networks struggle to capture global, long-range semantic interactions, which makes it difficult to segment the iris region accurately. Inspired by the success of the Vision Transformer (ViT) and the Swin Transformer (Swin T), we propose a hybrid deep learning approach to segment heterogeneous iris images. Specifically, we first propose a bilateral segmentation backbone network that combines the benefits of Swin T with those of CNNs. Then, a multiscale feature information extraction module (MFIEM) is introduced to extract multiscale spatial information at a more granular level. Finally, a channel attention mechanism module (CAMM) is used to enhance the discriminability of the iris region. Experimental results on a multisource heterogeneous iris database show that our network has a significant performance advantage over several state-of-the-art (SOTA) iris segmentation networks.
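As a rough illustration of what a channel attention module does, the following is a minimal sketch in plain Python. The abstract does not describe the internals of CAMM, so this assumes a generic squeeze-and-excitation style design (global average pooling, a small bottleneck, sigmoid gates); it should not be read as the authors' implementation. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy feature map are all hypothetical.

```python
import math


def channel_attention(feature_map, w1, w2):
    """Reweight the channels of a [C][H][W] feature map (SE-style sketch).

    w1 has shape [hidden][C] and w2 has shape [C][hidden]; both are
    assumed, illustrative weights, not values from the paper.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: bottleneck linear layer followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(weights, squeezed)))
        for weights in w1
    ]
    # Excitation, step 2: expand back to C channels, sigmoid gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(weights, hidden))))
        for weights in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input: two 2x2 channels, one active and one empty.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
w1 = [[1.0, 0.0]]        # C=2 -> hidden=1
w2 = [[2.0], [-2.0]]     # hidden=1 -> C=2
scaled, gates = channel_attention(features, w1, w2)
```

With these toy weights the active channel receives the larger gate, which is the sense in which channel attention can make some feature channels more prominent than others. In a trained network the weights would be learned rather than fixed by hand.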