Abstract

The capsule network is a novel architecture for encoding the feature attributes and spatial relationships of an image. A capsule network (CapsNet) model can be trained using the dynamic routing (DR) algorithm. However, the original three-layer CapsNet with the DR algorithm performs poorly on complex data sets such as FashionMNIST, CIFAR-10, and CIFAR-100, and this deficiency limits the wider application of capsule networks. In this article, we propose a deep capsule network model combined with a U-Net preprocessing module (DCN-UN). Local-connection and weight-sharing strategies are adopted from convolutional neural networks to design a convolutional capsule layer in the DCN-UN model, which considerably reduces the number of parameters. Moreover, a greedy strategy is incorporated into the design of a mask DR (MDR) algorithm to improve model performance. DCN-UN requires up to five times fewer parameters than the original CapsNet and other CapsNet-based models. The performance improvement achieved by the DCN-UN model with the MDR algorithm over the original CapsNet model with the DR algorithm is approximately 12% and 17% on the CIFAR-10 and CIFAR-100 data sets, respectively. The experimental results confirm that the proposed DCN-UN model preserves the image-reconstruction and equivariance advantages of capsule networks. Moreover, an efficient initialization method is explored to enhance training stability and avoid gradient explosion.
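
The abstract names two mechanisms without giving their details: a convolutional capsule layer that reuses CNN-style local connections and weight sharing, and a greedy mask applied during dynamic routing. Below is a minimal, illustrative sketch of how such components could look; it is not the authors' implementation, and all names (ConvCapsule, masked_routing, top_k) and design choices (grouped convolution for vote generation, keeping only the top-k couplings per input capsule) are assumptions made for illustration.

```python
# Hypothetical sketch of a convolutional capsule layer with greedily masked
# dynamic routing. Not the DCN-UN code; shapes and hyperparameters are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    """Standard capsule squashing non-linearity."""
    norm2 = (s * s).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + eps)


class ConvCapsule(nn.Module):
    """Convolutional capsule layer: one shared grouped Conv2d produces votes from
    every input capsule type to every output capsule (local connections + weight sharing)."""

    def __init__(self, in_caps, in_dim, out_caps, out_dim, kernel_size=3, stride=1):
        super().__init__()
        self.in_caps, self.in_dim = in_caps, in_dim
        self.out_caps, self.out_dim = out_caps, out_dim
        # Each input capsule type shares a single kernel for all spatial positions.
        self.vote_conv = nn.Conv2d(
            in_caps * in_dim, in_caps * out_caps * out_dim,
            kernel_size, stride=stride, padding=kernel_size // 2, groups=in_caps)

    def forward(self, x, routing_iters=3, top_k=2):
        # x: (B, in_caps, in_dim, H, W)
        B, _, _, H, W = x.shape
        votes = self.vote_conv(x.reshape(B, self.in_caps * self.in_dim, H, W))
        Ho, Wo = votes.shape[-2:]
        votes = votes.reshape(B, self.in_caps, self.out_caps, self.out_dim, Ho, Wo)
        return masked_routing(votes, routing_iters, top_k)


def masked_routing(votes, iters, top_k):
    """Dynamic routing with a greedy mask: after the routing iterations, only the
    top-k couplings per input capsule are kept when forming the final output."""
    B, I, O, D, H, W = votes.shape
    logits = torch.zeros(B, I, O, 1, H, W, device=votes.device)
    for _ in range(iters):
        c = F.softmax(logits, dim=2)                  # couplings over output capsules
        out = squash((c * votes).sum(dim=1), dim=2)   # (B, O, D, H, W)
        agreement = (votes * out.unsqueeze(1)).sum(dim=3, keepdim=True)
        logits = logits + agreement
    # Greedy masking: zero all but the strongest couplings, then recompute outputs.
    kth = logits.topk(top_k, dim=2).values[:, :, -1:, :, :, :]
    c = F.softmax(logits, dim=2) * (logits >= kth).float()
    return squash((c * votes).sum(dim=1), dim=2)


if __name__ == "__main__":
    caps = ConvCapsule(in_caps=8, in_dim=4, out_caps=10, out_dim=8)
    x = torch.randn(2, 8, 4, 14, 14)
    print(caps(x).shape)  # torch.Size([2, 10, 8, 14, 14])
```

The grouped convolution gives the parameter saving the abstract refers to: vote weights are shared across spatial positions instead of being learned per input-output capsule pair, and the greedy mask suppresses weak couplings rather than letting every input capsule contribute to every output capsule.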
