Abstract

Convolutional neural networks (CNNs) have achieved extraordinary success on many image classification tasks in recent years. Dilated convolution can enlarge a CNN's receptive field and improve its performance, and it can also be used to compress a CNN into a lightweight model. Previous studies have adopted multiscale dilated convolution mainly to improve the internal structure of a specific CNN model. Because they enable the direct use of pretrained models, transfer-learning CNNs (TL-CNNs) have been widely applied to image recognition on small datasets. This paper proposes a novel multiscale dilated-convolution-based ensemble learning (MDCEL) method for effectively improving the performance of a pretrained CNN model. The primary assumption is that multiscale dilated convolution can produce different semantic representations of an image; therefore, constructing an ensemble of diverse TL-CNN classifiers makes it possible to achieve higher performance than traditional TL-CNN methods. The MDCEL method is highly versatile and can be applied to various conventional pretrained CNN models as well as lightweight CNN models. Moreover, it requires no modification of the internal structure of the pretrained CNN models and has high training efficiency. Experimental results on three public image classification datasets demonstrate that the proposed method outperforms the traditional TL-CNN baseline, improving accuracy and F1 scores by approximately 1–4%. In addition, an experiment on a real-world dataset obtained from a manufacturing enterprise further demonstrates the practicability of the proposed method.
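
The abstract gives no implementation details, but the described pipeline (multiscale dilated convolution producing diverse views of an image, each view feeding a transfer-learned CNN, and the resulting classifiers combined into an ensemble) can be sketched roughly as follows. This is a minimal illustration in PyTorch under stated assumptions, not the authors' implementation: the fixed dilated front-end per ensemble member, the ResNet-18 backbone, the softmax-averaging fusion rule, and all names (DilatedFrontEnd, build_tl_cnn, MDCELEnsemble, dilation_rates) are hypothetical.

```python
# A rough sketch of the MDCEL idea as described in the abstract; an
# interpretation for illustration only, not the authors' implementation.
import torch
import torch.nn as nn
from torchvision import models


class DilatedFrontEnd(nn.Module):
    """One dilated 3x3 convolution (3 -> 3 channels) that produces a rescaled
    'view' of the image without touching the pretrained backbone."""

    def __init__(self, dilation: int):
        super().__init__()
        # padding = dilation keeps the spatial size unchanged for a 3x3 kernel.
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=dilation,
                              dilation=dilation, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x)


def build_tl_cnn(num_classes: int) -> nn.Module:
    """Transfer-learning CNN: a pretrained ResNet-18 with a new classifier head."""
    backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
    return backbone


class MDCELEnsemble(nn.Module):
    """Ensemble of TL-CNN members, each preceded by a different dilation rate.
    Averaging the members' softmax outputs is one plausible fusion rule."""

    def __init__(self, num_classes: int, dilation_rates=(1, 2, 3)):
        super().__init__()
        self.members = nn.ModuleList(
            nn.Sequential(DilatedFrontEnd(d), build_tl_cnn(num_classes))
            for d in dilation_rates
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        member_probs = [m(x).softmax(dim=1) for m in self.members]
        return torch.stack(member_probs).mean(dim=0)


if __name__ == "__main__":
    model = MDCELEnsemble(num_classes=10)
    out = model(torch.randn(2, 3, 224, 224))  # toy batch of two 224x224 RGB images
    print(out.shape)  # torch.Size([2, 10])
```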
