Abstract
Convolutional neural networks (CNNs) have become the reference approach for semantic segmentation of very-high-resolution (VHR) images, owing to their ability to capture semantic information automatically while learning relevant features. However, as with most supervised methods, map accuracy depends on the quantity and quality of the ground truth (GT) used for training. Densely annotated data (i.e., a detailed, exhaustive, pixel-level GT) yield effective CNN models but normally require a high annotation effort. Such GT is often available in the benchmark datasets on which new methods are tested, but not in real land-cover applications, where only sparse annotations may be cost-effective. A CNN trained with such incomplete GT maps tends to smooth object boundaries, because they are never precisely delineated in the GT. To cope with these shortcomings, we propose to exploit the intermediate activation maps of the CNN and to deploy a semisupervised fully connected conditional random field (CRF). Compared with competitors using the same sparse annotations, the proposed method closes a larger part of the performance gap with respect to a CNN trained on the densely annotated, but generally unavailable, GT.
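To make the idea of a semisupervised fully connected CRF concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Potts label-compatibility model, a single Gaussian kernel over per-pixel feature vectors (standing in for intermediate CNN activations), naive mean-field inference, and clamping of the sparsely annotated pixels to their labels. The function name `dense_crf_mean_field` and the parameters `bandwidth`, `weight`, and `n_iters` are hypothetical, and the toy inputs are invented for illustration only.

```python
import numpy as np

def dense_crf_mean_field(unary_prob, features, labels,
                         n_iters=5, bandwidth=1.0, weight=1.0):
    """Naive mean-field inference for a fully connected CRF (illustrative).

    unary_prob: (N, K) class probabilities per pixel, e.g. CNN softmax output.
    features:   (N, D) per-pixel features, e.g. intermediate activations.
    labels:     (N,) integer labels, -1 for unannotated pixels.
    """
    n, k = unary_prob.shape
    # Pairwise Gaussian kernel between ALL pixel pairs (fully connected).
    # This is O(N^2) in memory, so it is only usable on tiny inputs.
    sq_dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    np.fill_diagonal(kernel, 0.0)  # a pixel sends no message to itself

    log_unary = np.log(np.clip(unary_prob, 1e-8, 1.0))
    labeled = labels >= 0
    onehot = np.zeros((n, k))
    onehot[labeled, labels[labeled]] = 1.0

    q = unary_prob.copy()
    q[labeled] = onehot[labeled]  # assumption: annotated pixels are clamped
    for _ in range(n_iters):
        # Potts model: similar pixels are rewarded for sharing a label.
        message = weight * (kernel @ q)
        logits = log_unary + message
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        q = np.exp(logits)
        q /= q.sum(axis=1, keepdims=True)
        q[labeled] = onehot[labeled]  # re-clamp after every update
    return q

# Toy example: two feature clusters, one annotated pixel in each.
features = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = np.array([0, -1, -1, 1, -1, -1])
unary_prob = np.full((6, 2), 0.5)  # uninformative CNN output
q = dense_crf_mean_field(unary_prob, features, labels)
pred = q.argmax(axis=1)
```

In this toy setting the two annotated pixels propagate their labels to the unannotated pixels that are close to them in feature space. Practical dense CRFs replace the explicit pairwise kernel with efficient approximate filtering, which this sketch does not attempt.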
More From: IEEE Transactions on Geoscience and Remote Sensing