Abstract

Agricultural image segmentation has not kept pace with the rapid development of deep learning: the explosive computational overhead of Transformers and the scarcity of high-quality labeled datasets are the main obstacles to applying them in this field. This study proposes a Simple-Attention Block (SIAB) built from channel-by-channel and spatial convolutions, whose computational complexity scales linearly with the input image size. We then design Simpleformer by cascading SIAB with a feed-forward network (FFN), reshaping the Transformer architecture. Further, fusing a CNN with Simpleformer yields CS-Net, an agricultural image segmentation model whose performance does not depend on dataset quality. Finally, we evaluate CS-Net on four datasets: compared with state-of-the-art models, it achieves faster inference and higher segmentation accuracy, advancing the use of Transformers in agricultural image processing. We also analyze why Transformer performance collapses in agricultural applications, providing a theoretical foundation for researchers working on related problems.
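The abstract does not give SIAB's exact layer composition, so the following is only a minimal PyTorch sketch of the general idea it describes: replacing quadratic self-attention with a channel-by-channel (depthwise) convolution followed by a spatial cross-channel (pointwise) convolution, so the cost grows linearly with the number of pixels. All layer choices and names below are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: a "simple attention" style block whose cost is
# O(H*W) in the spatial resolution, unlike O((H*W)^2) self-attention.
import torch
import torch.nn as nn


class SimpleAttentionBlockSketch(nn.Module):
    """Illustrative block; structure is assumed, not taken from the paper."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        # Channel-by-channel (depthwise) convolution: one filter per channel.
        self.depthwise = nn.Conv2d(
            channels, channels, kernel_size, padding=padding, groups=channels
        )
        # 1x1 convolution mixes information across channels at each location.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)
        self.norm = nn.BatchNorm2d(channels)
        self.act = nn.Sigmoid()  # gate in [0, 1], applied like an attention map

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.act(self.norm(self.pointwise(self.depthwise(x))))
        return x * attn  # reweight the input features with the learned gate


if __name__ == "__main__":
    block = SimpleAttentionBlockSketch(channels=64)
    feats = torch.randn(1, 64, 128, 128)
    print(block(feats).shape)  # torch.Size([1, 64, 128, 128])
```

Both convolutions touch each pixel a constant number of times, which is what makes the complexity linear in image size; in a Simpleformer-style design such a block would be cascaded with an FFN in place of the standard attention sub-layer.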
