Distillation Framework Research Articles

Minimizing computational complexity is essential for the adoption of deep networks in practical applications. Most current research attempts to accelerate deep networks by designing new network structures or compressing network parameters, while transfer learning techniques such as knowledge distillation are used to preserve the performance of deep models. In this paper, we focus on accelerating deep models and relieving the computational burden by using low-resolution (LR) images as inputs while maintaining competitive performance, an approach that is rarely studied in the current literature. Deep networks may suffer serious performance degradation with LR inputs because many details are unavailable in LR images. In addition, existing approaches may fail to learn discriminative features for LR images because of the dramatic appearance variations between LR and high-resolution (HR) images. To tackle these problems, we propose a resolution-aware knowledge distillation (RKD) framework that narrows the cross-resolution variations by transferring knowledge from the HR domain to the LR domain. The proposed framework consists of an HR teacher network and an LR student network. First, we introduce a discriminator and propose an adversarial learning strategy to shrink the variations between inputs of differing resolution. Then we design a cross-resolution knowledge distillation (CRKD) loss to train a discriminative student network by exploiting the knowledge of the teacher network. The CRKD loss consists of a resolution-aware distillation loss, a pair-wise constraint, and a maximum mean discrepancy loss. Experimental results on person re-identification, image classification, face recognition, and defect segmentation tasks demonstrate that RKD outperforms traditional knowledge distillation methods, achieving better performance with lower computational complexity. Furthermore, CRKD surpasses state-of-the-art knowledge distillation methods in transferring knowledge across different resolutions under the RKD framework, especially when coping with large resolution differences.
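
The abstract above names a maximum mean discrepancy (MMD) loss as one component of the CRKD loss but does not give its form. As a rough illustration only, the sketch below computes the standard biased estimate of squared MMD with a Gaussian kernel between two sets of feature vectors. The function names, the choice of kernel, and the bandwidth value are assumptions made for this example, not the paper's formulation.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel between two equal-length feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(hr_feats, lr_feats, bandwidth=1.0):
    """Biased estimate of squared MMD between two sets of feature vectors.

    hr_feats and lr_feats are hypothetical stand-ins for features produced
    by an HR teacher and an LR student on the same batch of images.
    """
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(hr_feats, hr_feats)
            + mean_kernel(lr_feats, lr_feats)
            - 2.0 * mean_kernel(hr_feats, lr_feats))


# Toy usage: student features close to the teacher's give a smaller discrepancy.
teacher = [[0.0, 0.0], [1.0, 0.0]]
student_near = [[0.1, 0.0], [1.1, 0.0]]
student_far = [[5.0, 5.0], [6.0, 5.0]]
print(mmd_squared(teacher, student_near) < mmd_squared(teacher, student_far))
```

Minimizing a term of this kind pulls the distribution of student features toward that of the teacher, which is the general role an MMD loss plays in feature-level distillation.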

Abstract. Due to the proliferation of Earth Observation programmes, information at different spatial, spectral and temporal resolutions is collected by various sensors (optical, radar, hyperspectral, LiDAR, etc.). Despite this abundance of information, it is not always possible to obtain complete coverage of the same area (especially a large one) from all the different sensors due to (i) atmospheric conditions and/or (ii) acquisition cost. In this context of data (or modality) misalignment, only part of the area under consideration may be covered by the different sensors (modalities). Unfortunately, standard machine learning approaches commonly employed in operational Earth monitoring systems require consistency between training and test data (i.e., they need to match the same information schema). Such a constraint limits the use of additional valuable information, i.e., information coming from a particular sensor that may be available at training but not at test time. Recently, a framework able to manage such misalignment between training and test information was proposed under the name of Generalized Knowledge Distillation (GKD). With the aim of providing a proof of concept of GKD in the context of multi-source Earth Observation analysis, we present a Generalized Knowledge Distillation framework for land use land cover mapping involving radar (Sentinel-1) and optical (Sentinel-2) satellite image time series (SITS) data. Considering that part of the optical information may not be available due to bad atmospheric conditions, we assume that radar SITS are always available (at both training and test time) while optical SITS are only accessible when the model is learnt (i.e., they are considered privileged information). Evaluations are carried out on a real-world study area in the southwest of France, namely Dordogne, considering a mapping task involving seven different land use land cover classes. Experimental results underline how the additional (privileged) information improves the results of the radar-based classification, with the main gain on the agricultural classes.
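
The abstract above does not spell out the training objective. As commonly formulated, generalized distillation trains the student on a weighted mix of the true labels and the temperature-softened predictions of a teacher that had access to the privileged input. The sketch below illustrates that mix for a single sample; the function names, the imitation weight, and the temperature are assumptions chosen for the example, and the logits are made-up numbers rather than outputs of any Sentinel-1 or Sentinel-2 model.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))


def gkd_student_loss(student_logits, teacher_logits, label,
                     imitation=0.5, temperature=2.0):
    """Weighted sum of hard-label and soft-label cross-entropy for one sample.

    student_logits: output of a student that sees only the always-available input.
    teacher_logits: output of a teacher trained with the privileged input.
    imitation and temperature are hypothetical hyperparameter values.
    """
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    student_probs = softmax(student_logits)
    soft_targets = softmax(teacher_logits, temperature)
    hard_term = cross_entropy(one_hot, student_probs)
    soft_term = cross_entropy(soft_targets, student_probs)
    return (1.0 - imitation) * hard_term + imitation * soft_term


# Toy usage with three classes and made-up logits.
loss = gkd_student_loss([2.0, 0.0, 0.0], [3.0, 0.0, 0.0], label=0)
print(loss)
```

Setting the imitation weight to 0 recovers ordinary supervised training on the hard labels, while a weight of 1 makes the student learn from the teacher's soft predictions alone.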

Articles published on Distillation Framework

BERTtoCNN: Similarity-preserving enhanced knowledge distillation for stance detection

Positive-parity baryon spectrum and the role of hybrid baryons

Top-aware recommender distillation with deep reinforcement learning

DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval

Classifier-adaptation knowledge distillation framework for relation extraction and event detection with imbalanced data

Continual Learning for Named Entity Recognition

Lord of the Rings: Hanoi Pooling and Self-Knowledge Distillation for Fast and Accurate Vehicle Reidentification

Deep Cross-Modal Representation Learning and Distillation for Illumination-Invariant Pedestrian Detection

KDnet-RUL: A Knowledge Distillation Framework to Compress Deep Neural Networks for Machine Remaining Useful Life Prediction

Distillation at high momentum

Resolution-Aware Knowledge Distillation for Efficient Inference

Double Similarity Distillation for Semantic Image Segmentation

Mutual-learning sequence-level knowledge distillation for automatic speech recognition

Topological Space Knowledge Distillation for Compact Road Extraction in Optical Remote Sensing Images

Ensemble Learning of Lightweight Deep Learning Models Using Knowledge Distillation for Image Classification

Generalized Knowledge Distillation for Multi-Sensor Remote Sensing Classification: An Application to Land Cover Mapping

Online Knowledge Distillation with Diverse Peers

Knowledge Distillation in Acoustic Scene Classification

Improving Low-Resource Neural Machine Translation With Teacher-Free Knowledge Distillation

Progressive Motion Representation Distillation With Two-Branch Networks for Egocentric Activity Recognition
