Abstract

Convolutional neural networks (CNNs) have recently been widely used in remote-sensing scene classification, and it is becoming popular to automatically learn CNN architectures tailored to specific data sets. The rich contextual information in high-resolution remote-sensing images (RSIs) is critical to remote-sensing intelligent understanding tasks. However, architecture learning approaches tend to simplify the original data for efficiency (e.g., by resizing images to a smaller resolution), which discards contextual information in RSIs. In this article, we propose a contextual information-preserved architecture learning (CIPAL) framework for remote-sensing scene classification that exploits the contextual information in RSIs as much as possible during the architecture learning process. We introduce channel compression into CIPAL, which reduces the memory and time consumption of architecture learning and makes it possible to construct a larger architecture space. We add operators that are rarely used in scene classification, such as atrous convolution, to the architecture space to explore unknown architectures better suited to remote-sensing scenes. Experimental results on four remote-sensing scene classification benchmarks indicate that CIPAL learns architectures in less time than similar works, and that the newly found architectures outperform popular hand-designed architectures by making better use of the contextual information in RSIs. Different architectures are good at learning different representations, and our proposed architecture learning method can potentially help us understand which types of representations are crucial for RSI intelligent understanding.
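The abstract does not give implementation details, but the two key ingredients it names, channel compression and atrous convolution as a candidate operator, can be sketched concretely. Below is a minimal, hypothetical PyTorch sketch of how a search-space operator might compress channels with a 1x1 convolution before applying a dilated 3x3 convolution; the class name, compression ratio, and layer arrangement are illustrative assumptions, not CIPAL's actual code.

```python
import torch
import torch.nn as nn

class CompressedCandidateOp(nn.Module):
    """Hypothetical search-space operator (not from the paper):
    a 1x1 channel-compression convolution followed by an atrous
    (dilated) 3x3 convolution. Compressing channels before the
    candidate operator lowers the memory and compute cost of
    evaluating many operators during architecture search."""

    def __init__(self, in_channels, out_channels, dilation=2, ratio=4):
        super().__init__()
        mid = max(in_channels // ratio, 1)  # assumed compression ratio
        self.compress = nn.Sequential(
            nn.Conv2d(in_channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        # Atrous convolution enlarges the receptive field without
        # downsampling, so spatial context in large RSIs is preserved.
        self.atrous = nn.Conv2d(
            mid, out_channels, kernel_size=3,
            padding=dilation, dilation=dilation, bias=False,
        )

    def forward(self, x):
        return self.atrous(self.compress(x))

# Example: a full-resolution 600x600 patch with 64 feature channels.
op = CompressedCandidateOp(64, 64, dilation=2)
y = op(torch.randn(1, 64, 600, 600))
print(y.shape)  # torch.Size([1, 64, 600, 600])
```

Because padding equals the dilation rate for a 3x3 kernel, the spatial resolution is unchanged, which matches the abstract's goal of preserving contextual information rather than shrinking the input.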
