As the demand for land use monitoring continues to grow, high-precision remote sensing products have become increasingly important. Compared with traditional methods, deep learning networks offer significant advantages in automatic feature extraction, handling complex scenes, and improving classification accuracy. However, as these networks grow more complex, so does their computational cost. To address this challenge, we propose a knowledge distillation model that integrates two key modules: spatial-global attention feature distillation (SGAFD) and channel attention-based relational distillation (CARD). This model enables a lightweight “student” network to be guided by a large “teacher” network, enhancing classification performance while keeping the model compact. We validated our approach on the large-scale public remote sensing datasets GID15 and LoveDA; the results show that both modules effectively improve classification performance, overcoming the limitations of lightweight models and advancing the practical application of land use monitoring.
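The abstract does not spell out the exact formulation of SGAFD and CARD, so the sketch below is only a rough illustration of the two underlying ideas: matching spatial attention maps between teacher and student features, and matching channel-wise relational (Gram) structure. The function names, the squared-activation attention, and the alpha/beta loss weights are all assumptions rather than the paper's actual method, and teacher and student features are assumed to have matching shapes (in practice a projection layer is typically inserted when they differ).

```python
import torch
import torch.nn.functional as F

def spatial_attention_map(feat: torch.Tensor) -> torch.Tensor:
    """Collapse channels into an L2-normalized spatial attention map (B, H*W)."""
    att = feat.pow(2).mean(dim=1).flatten(1)  # (B, H*W)
    return F.normalize(att, p=2, dim=1)

def channel_relation(feat: torch.Tensor) -> torch.Tensor:
    """Channel-wise Gram matrix capturing inter-channel relations (B, C, C)."""
    b, c, h, w = feat.shape
    f = F.normalize(feat.view(b, c, h * w), p=2, dim=2)
    return torch.bmm(f, f.transpose(1, 2))  # (B, C, C)

def distillation_loss(student_feat: torch.Tensor,
                      teacher_feat: torch.Tensor,
                      alpha: float = 1.0,   # hypothetical spatial-term weight
                      beta: float = 1.0) -> torch.Tensor:  # hypothetical channel-term weight
    """Combine a spatial-attention feature term with a channel-relation term.

    The teacher features come from the frozen large network, so gradients
    flow only into the student branch.
    """
    teacher_feat = teacher_feat.detach()
    loss_spatial = F.mse_loss(spatial_attention_map(student_feat),
                              spatial_attention_map(teacher_feat))
    loss_channel = F.mse_loss(channel_relation(student_feat),
                              channel_relation(teacher_feat))
    return alpha * loss_spatial + beta * loss_channel
```

In a training loop, this auxiliary loss would be added to the student's ordinary segmentation/classification loss, so the student fits the labels while also mimicking the teacher's spatial and channel-wise feature statistics.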