Abstract

Convolutional neural networks (CNNs) have proven very successful at learning task-specific computer vision features. To integrate features from different layers of a standard CNN, we present a fusion framework of shortcut convolutional neural networks (S-CNNs). The framework can fuse features at arbitrary scales by adding weighted shortcut connections to a standard CNN. Beyond the framework itself, we propose a binary-string shortcut indicator (SI) to denote a specific S-CNN shortcut style, and we design a learning algorithm for the proposed S-CNNs. Comprehensive experiments compare their performance with that of standard CNNs on multiple benchmark datasets across different visual tasks. Empirical results show that, with an appropriate fusion style of shortcut connections with learnable weights, S-CNNs can outperform standard CNNs in accuracy and stability across different activation functions, pooling schemes, initializations, and occlusions. Moreover, S-CNNs are competitive with ResNets and can outperform GoogLeNet, DenseNets, Multi-scale CNN, and DeepID.
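The weighted shortcut fusion summarized above can be sketched in a toy form. The following is a minimal illustration, not the paper's implementation: it stands in fully connected layers of equal width for convolutional layers, and the interpretation of the SI string (bit k enables a learnably-weighted shortcut from the network input into the output of layer k) is an assumption for illustration.

```python
import numpy as np

def relu(x):
    # Elementwise rectified linear activation.
    return np.maximum(x, 0.0)

def s_cnn_forward(x, layer_weights, shortcut_weights, si):
    """Toy forward pass illustrating shortcut fusion (illustrative only).

    si is a binary string; si[k] == "1" enables a weighted shortcut that
    fuses the network input into the output of layer k, scaled by the
    learnable scalar shortcut_weights[k].
    """
    h = x
    for k, W in enumerate(layer_weights):
        h = relu(h @ W)
        if k < len(si) and si[k] == "1":
            # Weighted shortcut connection: fuse the earlier feature.
            h = h + shortcut_weights[k] * x
    return h

# Example usage with three equal-width layers and SI style "101"
# (shortcuts into layers 0 and 2, none into layer 1).
rng = np.random.default_rng(0)
d = 4
layer_weights = [rng.standard_normal((d, d)) * 0.1 for _ in range(3)]
shortcut_weights = [0.5, 0.0, 0.3]
x = rng.standard_normal((2, d))
y = s_cnn_forward(x, layer_weights, shortcut_weights, si="101")
```

In a full S-CNN the shortcut sources and targets would be feature maps at different scales, so a resizing or pooling step would be needed before fusion; the equal-width simplification here sidesteps that detail.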
