Abstract
Knowledge distillation has been used successfully to compress a large neural network (teacher) into a smaller neural network (student) by transferring the knowledge of the teacher network using its original training dataset. However, the original training dataset is not reusable in many real-world applications. To address this issue, data-free knowledge distillation, i.e., knowledge distillation in the absence of the original training dataset, has been studied. Existing methods, however, are limited to classification problems and cannot be directly applied to regression problems. In this study, we propose a novel data-free knowledge distillation method that is applicable to regression problems. Given a teacher network, we adopt a generator network to transfer the knowledge in the teacher network to a student network. We train the generator and student networks simultaneously in an adversarial manner: the generator network is trained to create synthetic data on which the teacher and student networks make different predictions, while the student network is trained to mimic the teacher network’s predictions on that data. We demonstrate the effectiveness of the proposed method on benchmark datasets. Our results show that the student network emulates the prediction ability of the teacher network with little performance loss.
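To make the alternating training procedure concrete, the following is a minimal PyTorch-style sketch of the adversarial generator/student updates described above. The network architectures, latent and input dimensions, MSE as the discrepancy measure, and all hyperparameters are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

# Assumed sizes and simple fully connected networks (for illustration only).
latent_dim, input_dim = 16, 8
teacher = nn.Sequential(nn.Linear(input_dim, 64), nn.ReLU(), nn.Linear(64, 1))
student = nn.Sequential(nn.Linear(input_dim, 16), nn.ReLU(), nn.Linear(16, 1))
generator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, input_dim))

teacher.eval()  # the pretrained teacher is frozen throughout
opt_s = torch.optim.Adam(student.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(1000):
    z = torch.randn(128, latent_dim)  # random latent codes

    # Generator step: create synthetic inputs on which teacher and student disagree.
    x = generator(z)
    with torch.no_grad():
        y_t = teacher(x)
    loss_g = -nn.functional.mse_loss(student(x), y_t)  # maximize the discrepancy
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()

    # Student step: mimic the teacher's predictions on the generated data.
    x = generator(z).detach()
    with torch.no_grad():
        y_t = teacher(x)
    loss_s = nn.functional.mse_loss(student(x), y_t)  # minimize the discrepancy
    opt_s.zero_grad()
    loss_s.backward()
    opt_s.step()
```

The key design choice reflected here is that the generator and student play a minimax game over the same discrepancy measure: the generator seeks inputs where the student still diverges from the teacher, and the student is then trained on exactly those inputs.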