Abstract

To improve the accuracy of capsule networks in disentangled representation, and to further extend their application in computer vision, this paper proposes a novel BDARS_CapsNet (bi-directional attention routing sausage capsule network) architecture. First, bi-directional routing, namely bottom-up and top-down attention, provides a feed-forward and feedback information mechanism that helps describe the attributes of an object entity more accurately and completely. Second, inspired by the concept of covering learning, a sausage measure model is introduced into the network. The sausage model measures both the similarities and the differences of capsules and projects them onto a more complex curved surface, which makes it possible to approximate any nonlinear function with arbitrary precision while preserving the local responsiveness of the capsule entities to the maximum extent. Finally, BDARS_CapsNet combines a CNN (convolutional neural network), bi-directional attention routing, and the sausage measure in capsule network modeling, making full use of both high-level category information and low-level visual information; as a result, reconstruction and classification accuracy are improved accordingly. Experiments demonstrate the effectiveness of the proposed information routing, the sausage measure, and the new framework. Furthermore, the proposed BDARS_CapsNet provides a foundation for future research on disentangled representation learning.
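The abstract does not give the paper's exact update rules, so as a rough illustration only, here is a minimal NumPy sketch of routing-by-agreement (the bottom-up baseline from the original CapsNet) with a hypothetical top-down attention re-weighting of the routing logits; the attention term is an assumption for illustration, not the paper's algorithm.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Squashing non-linearity: keeps a vector's orientation,
    # shrinks its length into [0, 1) so it can act as a probability.
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def bidirectional_routing(u_hat, iterations=3):
    """u_hat: predictions of lower capsules for each parent, shape (n_in, n_out, d).
    Bottom-up: routing-by-agreement as in the original CapsNet.
    Top-down (illustrative assumption): parent activation norms re-weight
    the agreement signal fed back into the routing logits."""
    n_in, n_out, d = u_hat.shape
    b = np.zeros((n_in, n_out))                     # routing logits
    for _ in range(iterations):
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)        # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)      # weighted sum -> (n_out, d)
        v = squash(s)                               # parent capsule outputs
        agreement = (u_hat * v[None]).sum(axis=-1)  # bottom-up agreement
        attn = np.linalg.norm(v, axis=-1)           # top-down attention (assumed)
        b = b + attn[None] * agreement
    return v

# 6 child capsules voting for 3 parent capsules in a 4-D pose space
v = bidirectional_routing(np.random.randn(6, 3, 4))
print(v.shape)  # (3, 4)
```

Because of the squashing step, every parent capsule's length stays below 1, so it can be read as the probability that the corresponding entity is present.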

Highlights

  • Over the past few years, convolutional neural networks (CNNs) have developed rapidly

  • We propose a novel BDARS_CapsNet architecture, which utilizes two key concepts, i.e., bi-directional attention routing and sausage measure

  • When we replace the routing of CapsNet with the proposed bi-directional attention mechanism and sausage measure, the accuracy reaches 99.44%, beating the second-best result (CapsNet-EM) by a large margin


Summary

INTRODUCTION

Over the past few years, convolutional neural networks (CNNs) have developed rapidly. Xinyi and Chen [37] proposed a capsule graph network that uses an attention module to scale node embeddings, followed by dynamic routing to generate graph capsules. A primary capsule layer is constructed as follows: convolution is conducted with 32 channels of convolutional 8D capsules, each of which comprises 8 convolutional units with a kernel size of 9 × 9 and a stride of 2; subsequently, bi-directional attention routing and the sausage measure are applied, yielding 32 × 6 × 6 capsules.
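The construction above can be sketched in NumPy as a naive primary-capsule layer: a strided valid convolution whose output channels are split into 32 capsule channels of 8 dimensions each. The number of input channels (16 below) is illustrative, since the summary does not give the full layer stack, and this is a sketch rather than the paper's implementation.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    # Capsule squashing non-linearity: vector length is mapped into [0, 1).
    sq = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq / (1.0 + sq)) * s / np.sqrt(sq + eps)

def primary_capsules(feature_map, weights, stride=2, capsule_dim=8):
    """feature_map: (C, H, W) output of the previous conv layer.
    weights: (32 * capsule_dim, C, 9, 9) kernels, i.e. 32 channels of 8-D capsules.
    Returns squashed capsules of shape (32 * H_out * W_out, capsule_dim)."""
    c, h, w = feature_map.shape
    o, _, k, _ = weights.shape
    h_out = (h - k) // stride + 1
    w_out = (w - k) // stride + 1
    out = np.zeros((o, h_out, w_out))
    for i in range(h_out):          # naive valid convolution, one location at a time
        for j in range(w_out):
            patch = feature_map[:, i*stride:i*stride+k, j*stride:j*stride+k]
            out[:, i, j] = np.tensordot(weights, patch, axes=([1, 2, 3], [0, 1, 2]))
    # split conv channels into (32 capsule channels, 8 dims) and flatten the grid
    caps = (out.reshape(o // capsule_dim, capsule_dim, h_out, w_out)
               .transpose(0, 2, 3, 1)
               .reshape(-1, capsule_dim))
    return squash(caps)

# A 20x20 feature map with a 9x9 kernel and stride 2 gives a 6x6 grid,
# hence 32 * 6 * 6 = 1152 primary capsules of dimension 8.
fmap = np.random.randn(16, 20, 20)              # 16 input channels, illustrative
kernels = np.random.randn(32 * 8, 16, 9, 9) * 0.01
caps = primary_capsules(fmap, kernels)
print(caps.shape)  # (1152, 8)
```

With a 20 × 20 input, the spatial output is (20 − 9) // 2 + 1 = 6 in each dimension, which is where the 32 × 6 × 6 capsule grid in the text comes from.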

BI-DIRECTIONAL ATTENTION ROUTING
Findings
CONCLUSION
