Abstract

Convolutional Neural Networks (CNNs) have become the de facto standard for medical image analysis. In a CNN, pooling layers downsample feature maps by aggregating features from local regions, which helps the network learn invariant features and reduces computational complexity. Various pooling techniques have been proposed, among which average and max pooling are the most widely used because of their simplicity. Since average pooling aggregates all the information in each region of each feature map, background features may dominate the pooled representation. Max pooling, on the other hand, keeps only the most activated feature in each region and can therefore capture noisy features. To overcome these limitations, researchers have proposed soft pooling techniques as an intermediate between max and average pooling. Various other pooling techniques have also been introduced for different purposes, including reducing overfitting, capturing higher-order information such as the correlation between features, and capturing spatial or structural information. In this work, we investigate different pooling mechanisms with ResNet, a widely used CNN architecture, for the classification of HEp-2 cell images. We found that average pooling generally performs better than other pooling techniques such as max pooling, soft pooling, and bilinear pooling. Simply changing the local pooling operation of the ResNet architecture from max to average yields an improvement of over 2% in mean class accuracy (MCA). Overall, our approach gives an MCA of 88% for the classification of HEp-2 cell images.
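
To make the contrast between the three local pooling operations concrete, the following is a minimal NumPy sketch and not the implementation used in this work. The function name `pool2d`, the non-overlapping window layout, and the toy feature map are assumptions made for illustration. The "soft" branch uses one common formulation, a softmax-weighted sum over each window, which may differ from the soft pooling variant evaluated here.

```python
import numpy as np


def pool2d(x, k, mode):
    """Pool a 2-D feature map with non-overlapping k x k windows.

    Hypothetical helper for illustration only; real CNN layers also
    handle batches, channels, strides, and padding.
    """
    h, w = x.shape
    if h % k != 0 or w % k != 0:
        raise ValueError("feature map size must be divisible by k")

    # Rearrange so that each k x k window is flattened on the last axis.
    windows = (
        x.reshape(h // k, k, w // k, k)
        .transpose(0, 2, 1, 3)
        .reshape(h // k, w // k, k * k)
    )

    if mode == "avg":
        # Every activation in the window contributes equally.
        return windows.mean(axis=-1)
    if mode == "max":
        # Only the strongest activation in the window survives.
        return windows.max(axis=-1)
    if mode == "soft":
        # Softmax weights: larger activations count more, but all contribute.
        # Subtracting the window maximum keeps the exponentials stable.
        e = np.exp(windows - windows.max(axis=-1, keepdims=True))
        weights = e / e.sum(axis=-1, keepdims=True)
        return (weights * windows).sum(axis=-1)
    raise ValueError(f"unknown pooling mode: {mode}")


# Toy feature map: the top-left window holds one isolated spike (5.0),
# standing in for a noisy activation on a low-valued background.
feature_map = np.array(
    [
        [0.1, 0.2, 0.0, 0.1],
        [0.1, 5.0, 0.2, 0.1],
        [0.3, 0.1, 1.0, 1.1],
        [0.2, 0.2, 0.9, 1.0],
    ]
)

avg_out = pool2d(feature_map, 2, "avg")
max_out = pool2d(feature_map, 2, "max")
soft_out = pool2d(feature_map, 2, "soft")

print("average pooling:\n", avg_out)
print("max pooling:\n", max_out)
print("soft pooling:\n", soft_out)
```

In the top-left window, max pooling passes the spike through unchanged, average pooling dilutes it with the surrounding background values, and the softmax-weighted result lies between the two.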
