Abstract
In various fields, knowledge distillation (KD) techniques that combine vision transformers (ViTs) and convolutional neural networks (CNNs) into a hybrid teacher have shown remarkable results in classification. For remote sensing images (RSIs), however, existing KD studies are scarce and their results are not competitive, which makes it hard to exploit the notable advantages of ViTs and CNNs. To address this, the authors introduce a novel hybrid‐model KD approach named HMKD‐Net, which comprises a CNN‐ViT ensemble teacher and a CNN student. Contrary to popular opinion, the authors posit that the sparsity of the RSI data distribution limits the effectiveness and efficiency of hybrid‐model knowledge transfer. As a solution, they propose a simple yet innovative method for handling variances during the KD phase, which substantially improves both the effectiveness and the efficiency of hybrid knowledge transfer. The authors evaluated HMKD‐Net on three RSI datasets. The results indicate that HMKD‐Net significantly outperforms other cutting‐edge methods while remaining considerably smaller. Specifically, HMKD‐Net exceeds other KD‐based methods by up to 22.8% in accuracy across the datasets. Ablation experiments further indicate that HMKD‐Net reduces the time cost of the KD process by about 80%. This study shows that hybrid‐model KD can be more effective and efficient when the sparsity of the data distribution in RSIs is handled well.