Abstract

Although deep learning has been applied to remote sensing (RS) with fruitful results, systematic research that explores model performance and guides the design of new convolutional neural network (CNN) architectures is still lacking. This gap concerns researchers and practitioners in the field because existing CNN structures may not be adequate for complex RS scenarios. In this study, we derive an empirical formula for CNN model performance from a literature review. Extensive experiments are conducted on six public RS data sets to investigate the influence of three architectural factors: network depth, width, and cardinality. Two types of CNN architectures, VGG and ResNet, are adopted as baselines. We monitor and visualize the data distributions and gradients of the CNNs used, to guard against vanishing or exploding gradients. Grad-CAM is adopted to open the black box of CNNs and to illustrate the effects of adjusting the architectural factors. Our experiments indicate that (1) increasing the network depth benefits the semantic feature learning capacity of a CNN model, but excessive depth leads to a decline in overall accuracy; (2) a partial widening strategy is effective because it can improve model performance while maintaining the network complexity; and (3) network cardinality has considerable potential for balancing model efficiency and accuracy. Suggestions for improving CNN model performance and developing new structures are summarized in this paper.
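
To make the width and cardinality factors concrete, the following minimal sketch counts the learnable weights of a single convolutional layer under the standard parameter formula for grouped convolution. The helper `conv_params` and the channel sizes (128, 256, 32 groups) are illustrative assumptions, not configurations taken from this study.

```python
def conv_params(c_in, c_out, kernel=3, groups=1, bias=False):
    """Count learnable parameters of a 2-D convolution layer.

    Each output channel sees only c_in / groups input channels, so the
    weight count is kernel * kernel * (c_in / groups) * c_out.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("c_in and c_out must be divisible by groups")
    weights = kernel * kernel * (c_in // groups) * c_out
    return weights + (c_out if bias else 0)


# Hypothetical layer sizes, chosen only for illustration.
baseline = conv_params(128, 128)            # plain 3x3 convolution
wider = conv_params(256, 256)               # doubled width, one group
grouped = conv_params(256, 256, groups=32)  # doubled width, cardinality 32

print(baseline, wider, grouped)
```

In this sketch, doubling the width quadruples the weight count of the layer, whereas splitting the wider layer into 32 groups brings it below the baseline. This is only the parameter-count side of the trade-off; it says nothing about the accuracy effects that the experiments measure.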
