Currently, due to the limited amount of data and the difficulty of designing a network, there are few papers on constructing a new convolutional neural network for scene classification using the publicly available datasets of high-resolution remote sensing images. Considering the existing problems, the current scene classification methods of high-resolution remote sensing images are summarized, and the IMFNet model is constructed to classify scenes of high-resolution remote sensing images in this paper. The IMFNet is an end-to-end network, which can learn features from data automatically. The main characteristic of the IMFNet network structure is that the Inception module is used to extract the details of remote sensing images and the multifeature fusion strategy is proposed to ensure the integrity of information. In addition, optimization methods are adopted to improve the classification accuracy. In order to verify the effectiveness of the method proposed in this paper, the two benchmark datasets—the UC Merced dataset and the SIRI-WHU dataset were adopted for experiments. The classification accuracy of the two datasets reaches 92.14% and 90.43%, respectively. Experimental results show that the method proposed has certain advantages over the classification methods based on low-level and middle-level visual features and even some classification methods based on high-level visual features.
Read full abstract