Abstract

Modern deep convolutional neural networks (CNNs) are often designed to be scalable, which gives rise to the concept of a model family. A model family is a large (possibly infinite) collection of related neural network architectures. The isomorphism of a model family refers to the fact that the models within it share the same high-level structure; the models within a family are accordingly called isomorphic models of one another. Existing weight initialization methods for CNNs rely on random or data-driven initialization. Although these methods initialize networks satisfactorily, they rarely exploit the isomorphism of model families. This work proposes an isomorphic model-based initialization method (IM Init) for CNNs, which can initialize any network from another well-trained isomorphic model in the same model family. We first formulate the widely used general network structure of CNNs. We then present a structural weight transformation that maps the weights of one isomorphic model onto another. Finally, we apply IM Init to model down-sampling and up-sampling scenarios and confirm, through experiments on various image classification datasets, that it improves accuracy and convergence speed. In the model down-sampling scenario, IM Init initializes a smaller target model from a larger well-trained source model. It improves the accuracy of RegNet200MF by 1.59% on the CIFAR-100 dataset and by 1.9% on the CUB200 dataset. Conversely, in the model up-sampling scenario, IM Init initializes a larger target model from a smaller well-trained source model. It significantly speeds up the convergence of RegNet600MF and improves its accuracy by 30.10% under short training schedules. Code will be available.
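
The abstract does not spell out the structural weight transformation, so the following is only an illustrative sketch of the down-sampling and up-sampling idea, not the paper's method. It assumes a naive stand-in rule: crop the leading channels when the target layer is narrower, and zero-pad when it is wider. The function name `resize_weight` and the 2-D (output channels by input channels) weight layout are hypothetical, chosen to keep the example small.

```python
def resize_weight(src, out_channels, in_channels):
    """Crop or zero-pad a 2-D weight matrix (out x in) to the target shape.

    Assumption: a naive stand-in for a structural weight transformation,
    not the transformation defined in the paper.
    """
    # Start from an all-zero target so any uncovered entries stay zero.
    dst = [[0.0] * in_channels for _ in range(out_channels)]
    # Copy the overlapping top-left block from the source weights.
    for o in range(min(len(src), out_channels)):
        for i in range(min(len(src[o]), in_channels)):
            dst[o][i] = src[o][i]
    return dst


# Hypothetical "well-trained" source layer with 6 output and 4 input channels.
wide = [[float(10 * o + i) for i in range(4)] for o in range(6)]

# Down-sampling: initialize a narrower target layer from the wider source.
narrow = resize_weight(wide, 3, 2)

# Up-sampling: initialize a wider target layer from the narrower source.
wider = resize_weight(narrow, 6, 4)
```

In this sketch the down-sampled layer keeps the leading block of the source weights, while the up-sampled layer reproduces the source block and leaves the remaining entries at zero. A practical method would need a more careful rule for choosing and filling channels.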
