Abstract
Computer vision, as a part of machine learning, is attracting significant attention from researchers. Aerial scene classification is a prominent area of computer vision with a wide range of applications: military, surveillance and security, environmental monitoring, detection of geospatial objects, etc. Several publicly available remote sensing image datasets enable the deployment of various aerial scene classification algorithms. In this article, we apply transfer learning from pre-trained deep Convolutional Neural Networks (CNNs) to remote sensing image classification. The networks used in our research are high-dimensional CNNs previously trained on the ImageNet dataset. Transfer learning can be performed through feature extraction or fine-tuning. We propose a two-stream feature extraction method, followed by image classification with a handcrafted classifier. Fine-tuning was performed with adaptive learning rates and with label smoothing as a regularization method. The proposed transfer learning techniques were validated on two remote sensing image datasets: the WHU-RS dataset and the AID dataset. Our proposed method obtained competitive results compared to state-of-the-art methods.
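As an illustration of the label smoothing mentioned in the abstract, the following is a minimal sketch in plain Python. It uses the common formulation in which a fraction epsilon of the target probability is spread uniformly over all classes. The function names and the value epsilon = 0.1 are assumptions chosen for illustration; the abstract does not state the value used by the authors.

```python
import math


def smooth_labels(class_index, num_classes, epsilon=0.1):
    """Build a smoothed target: (1 - epsilon) * one-hot + epsilon / num_classes.

    epsilon = 0.1 is an illustrative default, not a value taken from the paper.
    """
    uniform = epsilon / num_classes
    target = [uniform] * num_classes
    target[class_index] += 1.0 - epsilon
    return target


def cross_entropy(target, probs):
    """Cross-entropy between a target distribution and predicted probabilities."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0.0)


# Example: 4 scene classes, true class is index 2.
target = smooth_labels(2, 4, epsilon=0.1)
loss = cross_entropy(target, [0.05, 0.05, 0.85, 0.05])
```

Because the smoothed target never assigns probability 1 to a single class, the loss keeps penalizing overconfident predictions, which is the regularizing effect the technique is used for.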
Highlights
Scene classification is a process of assigning a semantic label to remote sensing (RS) images [1, 2]
The feature extraction transfer learning method was evaluated on the WHU-RS data set
The ResNet50 average pooling layer gives significant classification results when it is combined with DenseNet121 convolutional layers, either the last or the intermediate ones
For the fine-tuning method, under a 50% training data ratio, the linear learning rate decay scheduler gives better classification results for the ResNet50 and Inception V3 pre-trained networks, while cyclical learning rates are a better choice for Xception and DenseNet121
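The two learning rate strategies named in the highlights can be sketched as follows. This is a minimal plain-Python illustration, assuming a linear decay to zero over training and the triangular cyclical policy. The function names, step counts, and learning rate bounds are hypothetical and are not taken from the paper.

```python
import math


def linear_decay_lr(step, total_steps, base_lr):
    """Decay the learning rate linearly from base_lr to 0 over total_steps."""
    return base_lr * max(0.0, 1.0 - step / total_steps)


def triangular_clr(step, step_size, min_lr, max_lr):
    """Triangular cyclical learning rate.

    The rate rises linearly from min_lr to max_lr over step_size steps,
    then falls back to min_lr over the next step_size steps, and repeats.
    """
    cycle = math.floor(1 + step / (2 * step_size))
    x = abs(step / step_size - 2 * cycle + 1)
    return min_lr + (max_lr - min_lr) * max(0.0, 1.0 - x)


# Illustrative values only: 8-step cycle between 0.001 and 0.01.
schedule = [triangular_clr(s, 4, 0.001, 0.01) for s in range(9)]
```

In a training loop, either function would be evaluated once per step (or per epoch) and the result assigned to the optimizer's learning rate.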
Summary
Scene classification is a process of assigning a semantic label to remote sensing (RS) images [1, 2]. The problem of aerial scene classification is complex because remote sensing images have a compound composition and are rich in spatial and texture features. This complexity has motivated the development of numerous scene classification methods. The authors of [12] use multi-scale completed local binary patterns (MS-CLBP) and achieve state-of-the-art results compared to other methods based on low-level image features.