Abstract

Recently, deep neural networks (DNNs) have been used successfully in many fields, particularly in medical diagnosis. However, deep learning (DL) models are expensive in terms of memory and computing resources, which hinders their deployment on resource-limited devices or in delay-sensitive systems. Therefore, these deep models need to be accelerated and compressed to smaller sizes so that they can be deployed on edge devices without noticeably affecting their performance. In this paper, recent approaches for accelerating and compressing DNNs are analyzed and compared with respect to their performance, applications, benefits, and limitations, with particular focus on knowledge distillation as a successful emerging approach in this field. In addition, a framework is proposed for developing knowledge-distilled DNN models that can be deployed on fog/edge devices for automatic disease diagnosis. To evaluate the proposed framework, two compressed medical diagnosis systems based on knowledge-distilled deep neural models are proposed, one for COVID-19 and one for Malaria. The experimental results show that the knowledge-distilled models are compressed to 18.4% and 15% of the original model size and their responses are accelerated by factors of 6.14 and 5.86, respectively, with no significant drop in performance (drops of 0.9% and 1.2%, respectively). Furthermore, the distilled models are compared with pruned and quantized models; the obtained results reveal the superiority of the distilled models in terms of compression rate and response time.
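The knowledge distillation approach highlighted above trains a small "student" network to mimic the temperature-softened output distribution of a large "teacher" network while still fitting the ground-truth labels. As a minimal illustration of that idea (not the paper's actual implementation; the temperature `T=4.0` and weight `alpha=0.7` below are assumed example values), the standard distillation objective can be sketched in NumPy:

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; a higher T yields softer probabilities.
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, true_label, T=4.0, alpha=0.7):
    """Weighted sum of a soft-target KL term (student mimics teacher)
    and a hard-label cross-entropy term (student fits the true label).
    T and alpha are hypothetical hyperparameters for illustration."""
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    # KL(teacher || student), scaled by T^2 so the soft-term gradients
    # stay comparable in magnitude to the hard-term gradients.
    kl = np.sum(p_teacher * (np.log(p_teacher + 1e-12)
                             - np.log(p_student + 1e-12)))
    soft_loss = (T ** 2) * kl
    # Ordinary cross-entropy on the true label at T = 1.
    hard_loss = -np.log(softmax(student_logits)[true_label] + 1e-12)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

A student whose logits exactly match the teacher's incurs no KL penalty, so the loss pushes the compact student toward reproducing the teacher's behavior, which is what allows the distilled diagnosis models to stay close to the original models' accuracy at a fraction of their size.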
