Abstract

Fundus diseases not only endanger vision but also impose a serious economic burden on society. Fundus images provide an objective, standardized basis for diagnosing these diseases. With the continuous advancement of computer science, deep learning methods dominated by convolutional neural networks (CNNs) have been widely used in fundus image classification. However, current CNN-based fundus image classification research still has considerable room for improvement: CNNs cannot effectively avoid interference from repetitive background information, and they have limited capacity for global modeling. To address these issues, this paper proposes the CNN-Trans model. CNN-Trans is a parallel dual-branch network consisting of a CNN-LSTM branch and a Vision Transformer (ViT) branch. The CNN-LSTM branch uses a transfer-learned Xception as the initial feature extractor, with an LSTM placed before the classification head to mitigate the vanishing-gradient problem during network iterations; a new lightweight attention mechanism, Coordinate Attention, is introduced between Xception and the LSTM to emphasize classification-relevant information and suppress less useful, repetitive background information. The self-attention mechanism in the ViT branch, by contrast, is not limited to local interactions: it establishes long-range dependencies on the target and extracts global features. Finally, a concatenation (Concat) operation fuses the features of the two branches, so that the local features extracted by the CNN-LSTM branch and the global features extracted by the ViT branch complement each other, and the more comprehensive fused feature information is passed to the classification layer. Extensive experiments show that CNN-Trans achieves an accuracy of 80.68% on the fundus image classification task, with classification performance comparable to state-of-the-art methods.
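
For concreteness, the following is a minimal PyTorch sketch of the parallel dual-branch fusion described above, not the paper's implementation: the small stand-in CNN backbone (the paper uses a transfer-learned Xception), the omission of the Coordinate Attention module, the tiny ViT configuration, and the class count of 8 are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNLSTMBranch(nn.Module):
    """Local-feature branch: CNN backbone -> LSTM (attention omitted here)."""
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        # Stand-in backbone; the paper uses a transfer-learned Xception
        # with a Coordinate Attention module inserted before the LSTM.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((8, 8)),
        )
        # The LSTM reads the 8x8 feature map as a 64-step sequence.
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)

    def forward(self, x):
        f = self.backbone(x)                    # (B, feat_dim, 8, 8)
        seq = f.flatten(2).transpose(1, 2)      # (B, 64, feat_dim)
        _, (h, _) = self.lstm(seq)
        return h[-1]                            # final hidden state: (B, hidden)

class ViTBranch(nn.Module):
    """Global-feature branch: patch embedding + Transformer encoder."""
    def __init__(self, img=224, patch=16, dim=128, depth=2, heads=4):
        super().__init__()
        n = (img // patch) ** 2
        self.embed = nn.Conv2d(3, dim, patch, stride=patch)   # patchify
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)

    def forward(self, x):
        t = self.embed(x).flatten(2).transpose(1, 2)          # (B, N, dim)
        t = torch.cat([self.cls.expand(t.size(0), -1, -1), t], dim=1) + self.pos
        return self.encoder(t)[:, 0]                          # CLS token: (B, dim)

class CNNTrans(nn.Module):
    """Concatenate local and global features, then classify."""
    def __init__(self, num_classes=8):                        # class count assumed
        super().__init__()
        self.local_branch = CNNLSTMBranch()
        self.global_branch = ViTBranch()
        self.head = nn.Linear(128 + 128, num_classes)

    def forward(self, x):
        fused = torch.cat([self.local_branch(x), self.global_branch(x)], dim=1)
        return self.head(fused)

logits = CNNTrans()(torch.randn(2, 3, 224, 224))  # -> shape (2, 8)
```

Concatenation keeps both feature sets intact, leaving it to the classification layer to learn how to weight local evidence from the CNN-LSTM branch against global evidence from the ViT branch.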
