Diabetic retinopathy (DR) is a leading cause of blindness, and regular screening aids its early detection and containment. Automated, accurate lesion segmentation is therefore crucial for DR grading and diagnosis. The task remains challenging, however, because the different kinds of lesions have complex structures, inconsistent scales and blurry edges. To address these issues, this paper proposes a cascaded context fusion and multi-attention network (CMNet) for multi-lesion segmentation of DR images. CMNet comprises a triple attention module (TAM), a cascaded context fusion module (CFM) and a balanced attention module (BAM). First, TAM extracts channel attention, spatial attention and pixel-point attention features and fuses them by addition to ensure selectivity and consistency of the information representation. Next, CFM applies adaptive average pooling and a non-local operation to capture local and global contextual features, which are fused by concatenation to expand the receptive field for lesions. Finally, BAM computes foreground, background and boundary attention maps of lesions and uses a Squeeze-and-Excitation block to weight the feature channels and rebalance attention across the three regions, so that the network focuses on edge details for fine segmentation. We performed comprehensive comparison experiments and ablation studies on three public datasets, DDR, IDRiD and E-Ophtha, on which the mAUC for the segmentation of four lesions reached 0.6765, 0.7466 and 0.6710, respectively. The experimental results show that the proposed model outperforms current state-of-the-art methods, overcoming the adverse interference of background noise and preserving contour details to achieve precise segmentation of multi-scale lesions.
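The balanced-attention idea described for BAM can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation. The formulas for the three region maps (foreground p, background 1 - p, boundary 1 - |2p - 1|), the function names `region_attention_maps` and `se_gate`, and the toy weights `w1` and `w2` are all assumptions chosen for illustration; only the general Squeeze-and-Excitation pattern (global average, small bottleneck, sigmoid gate) is standard.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def region_attention_maps(logits):
    """Return (foreground, background, boundary) maps for a 2D grid of logits.

    Assumed definitions (illustrative, not taken from the paper):
    foreground = p, background = 1 - p, boundary = 1 - |2p - 1|,
    with p = sigmoid(logit). The boundary map peaks where p is near 0.5.
    """
    fg = [[sigmoid(v) for v in row] for row in logits]
    bg = [[1.0 - p for p in row] for row in fg]
    bd = [[1.0 - abs(2.0 * p - 1.0) for p in row] for row in fg]
    return fg, bg, bd

def se_gate(channels, w1, w2):
    """Squeeze-and-Excitation style gating over a list of 2D channel maps.

    w1 and w2 are hypothetical weight matrices (lists of rows) standing in
    for the two learned fully connected layers of an SE block.
    """
    # Squeeze: global average of each channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in channels]
    # Excite: bottleneck with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Rescale every channel by its gate.
    scaled = [[[g * v for v in row] for row in ch]
              for g, ch in zip(gates, channels)]
    return gates, scaled

# Toy 2x2 logit map: confident background, two uncertain pixels, confident lesion.
logits = [[-4.0, 0.0], [0.0, 4.0]]
fg, bg, bd = region_attention_maps(logits)

w1 = [[1.0, 1.0, 1.0]]        # 3 channels -> 1 hidden unit (toy values)
w2 = [[1.0], [0.0], [-1.0]]   # 1 hidden unit -> 3 gates (toy values)
gates, rebalanced = se_gate([fg, bg, bd], w1, w2)
```

In this toy setup the uncertain pixels (logit 0) receive the highest boundary attention, and the gates re-weight the three region maps relative to one another. In a trained network the weights would be learned rather than fixed by hand.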