Abstract

Deep neural networks outperform shallow ones, but higher accuracy is usually obtained by making the network deeper or wider, which introduces large numbers of parameters and computations. Networks that are too wide carry a high risk of overfitting, while networks that are too deep require a large amount of computation. This paper proposes a narrow-deep ResNet, increasing the depth of the network while avoiding the problems caused by making it too wide, and applies knowledge distillation: a trained teacher model is used to train the unmodified, wide, and narrow-deep ResNets as students, allowing the students to learn the teacher's output. To validate its effectiveness, the method is tested on the CIFAR-100 and Pascal VOC datasets. The proposed method allows a small model to reach approximately the same accuracy as a large model while dramatically reducing response time and computational cost.
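The abstract only names the teacher-student setup, so the following is a minimal sketch of the standard knowledge-distillation objective (softened-softmax matching plus hard-label cross-entropy); the temperature, loss weighting, and network names are assumed for illustration and are not taken from the paper.

```python
# Minimal knowledge-distillation loss sketch (assumed formulation, not the
# paper's exact recipe): the student matches the teacher's softened outputs
# while still fitting the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """temperature and alpha are illustrative hyperparameters."""
    # Soft targets: KL divergence between temperature-scaled distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard cross-entropy on the true labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Usage sketch: `teacher` is the trained (frozen) model, `student` is e.g.
# the narrow-deep ResNet; these names are hypothetical.
# teacher.eval()
# with torch.no_grad():
#     teacher_logits = teacher(images)
# loss = distillation_loss(student(images), teacher_logits, labels)
```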
