Abstract

Abstract. The use of convolutional neural networks has greatly improved data synthesis in recent years, and such networks have been widely used for data augmentation in scenarios where highly imbalanced data is observed, such as land cover segmentation. Balancing the proportion of classes for training segmentation models can be very challenging, considering that samples where all classes are reasonably represented might constitute only a small portion of a training set, and techniques for augmenting this small amount of data, such as rotation, scaling and translation, might not be sufficient for efficient training. In this context, this paper proposes a methodology to perform data augmentation from few samples to improve the performance of CNN-based land cover semantic segmentation. First, we estimate the latent data representation of selected training samples by means of a mixture of Gaussians, using an encoder-decoder CNN. Then, we perturb the latent embedding used to generate the mixture parameters, at random and at training time, to generate new mixture models slightly different from the original. Finally, we compute the displacement maps between the original and the modified mixture models, and use them to elastically deform the original images, creating new realistic samples out of the original ones. Our disentangled approach allows the spatial modification of displacement maps to preserve objects where deformation is undesired, such as buildings and cars, for which geometry is highly discriminant. With this simple pipeline, we managed to augment samples at training time and improve the overall performance of two baseline semantic segmentation CNN architectures for land cover semantic segmentation.
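To make the deformation step more concrete, the following is a minimal sketch assuming NumPy/SciPy; the function and array names (`warp_preserving_rigid`, `rigid_mask`) are illustrative, not the paper's implementation. It suppresses a displacement field inside a mask of rigid objects (e.g. buildings and cars) before resampling the image, in the spirit of the disentangled handling described above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def warp_preserving_rigid(image, disp_y, disp_x, rigid_mask):
    """Warp `image` with a displacement field, suppressing deformation
    inside `rigid_mask` (1 where geometry must be preserved)."""
    # Zero the displacement where deformation is undesired (buildings, cars).
    disp_y = disp_y * (1.0 - rigid_mask)
    disp_x = disp_x * (1.0 - rigid_mask)
    h, w = image.shape[:2]
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.stack([yy + disp_y, xx + disp_x])
    # Resample each channel at the displaced coordinates.
    return np.stack(
        [map_coordinates(image[..., c], coords, order=1, mode="reflect")
         for c in range(image.shape[-1])], axis=-1)

# Toy example: smooth random displacement field, rigid region kept untouched.
rng = np.random.default_rng(0)
img = rng.random((128, 128, 4)).astype(np.float32)   # four-channel tile
dy = gaussian_filter(rng.normal(0, 1, (128, 128)), sigma=8) * 10
dx = gaussian_filter(rng.normal(0, 1, (128, 128)), sigma=8) * 10
mask = np.zeros((128, 128)); mask[40:80, 40:80] = 1.0  # e.g. a building footprint
augmented = warp_preserving_rigid(img, dy, dx, mask)
```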

Highlights

  • Land cover segmentation is a very common application of remote sensing and is of great interest in many fields, such as agriculture and urban planning (Bokusheva et al., 2016)

  • Different land cover segmentation methods have been proposed in the literature, mostly based on object-based image analysis (Blaschke et al., 2014), and more recently on convolutional neural networks (Zhu et al., 2017)

  • We use an encoder-decoder CNN that takes an original image as input, encodes it into a latent embedding, and decodes the embedding into a Gaussian Mixture Model (GMM) that best represents the input image (a minimal sketch follows below)
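The sketch below illustrates this idea with a small PyTorch module; all layer sizes, the number of components and the parameterization (weights, 2-D means, diagonal variances) are assumptions for illustration, not the architecture from the paper.

```python
import torch
import torch.nn as nn

class GMMEncoderDecoder(nn.Module):
    """Hypothetical encoder-decoder: image -> latent z -> GMM parameters."""
    def __init__(self, in_channels=4, latent_dim=64, n_components=8):
        super().__init__()
        self.K = n_components
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, latent_dim))
        # Per component: 1 weight logit + 2 mean coordinates + 2 log-variances.
        self.decoder = nn.Linear(latent_dim, n_components * 5)

    def forward(self, x):
        z = self.encoder(x)
        p = self.decoder(z).view(-1, self.K, 5)
        weights = torch.softmax(p[..., 0], dim=-1)   # mixture weights
        means = torch.sigmoid(p[..., 1:3])           # normalised (y, x) means
        variances = torch.exp(p[..., 3:5])           # diagonal variances
        return z, weights, means, variances

model = GMMEncoderDecoder()
z, w, mu, var = model(torch.randn(1, 4, 128, 128))
```

Perturbing `z` before the decoder head would then yield a slightly different mixture, which is the basis for the augmentation described in the abstract.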


Summary

INTRODUCTION

While semantic segmentation is a widely discussed topic in the community (Yu et al., 2018; Buda et al., 2017), methods that show impressive results on this task still struggle with very imbalanced data. Much depends on the number of original samples available to balance the database, since standard transformations derive only a limited number of new samples. Different models, such as Generative Adversarial Networks, have been applied for balancing imbalanced training sets, but the data created using them often do not add discriminating information. In our approach, we compute the displacement map between the original and modified GMMs using an elastic registration method, and use it to warp the original image, generating new samples. The remainder of this paper presents our methodology, the data used, the pre-processing and the methods for augmentation, and then draws conclusions from this work and reports future research possibilities. Our experimental database, divided into tiles as detailed in the experimental design section, was composed of 10 four-channel normalized images for training, 3 for validation, and 3 for testing, with varied sizes.
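As a rough, self-contained sketch of how a perturbed mixture could translate into a displacement field (this is a simplified stand-in, not the elastic registration method used in the paper): each component's mean shift is spread over the image with Gaussian weights, and the resulting field could then be applied exactly as in the warping sketch after the abstract. All names and the interpolation scheme are assumptions.

```python
import numpy as np

def displacement_from_means(means_orig, means_new, shape, sigma=20.0):
    """Spread each component's mean shift over the image with Gaussian
    weights, yielding a dense (dy, dx) displacement field."""
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    dy = np.zeros(shape); dx = np.zeros(shape); norm = np.zeros(shape)
    for (y0, x0), (y1, x1) in zip(means_orig, means_new):
        wgt = np.exp(-((yy - y0) ** 2 + (xx - x0) ** 2) / (2 * sigma ** 2))
        dy += wgt * (y1 - y0); dx += wgt * (x1 - x0); norm += wgt
    norm = np.maximum(norm, 1e-8)
    return dy / norm, dx / norm

# Perturbing the (hypothetical) latent yields slightly shifted component means;
# here the shift is simulated directly with small Gaussian noise.
rng = np.random.default_rng(1)
means = rng.uniform(20, 100, size=(8, 2))
means_perturbed = means + rng.normal(0, 3, size=means.shape)
dy, dx = displacement_from_means(means, means_perturbed, (128, 128))
# dy, dx can now be passed to a warping routine such as the one sketched earlier.
```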

METHODOLOGY
Data and Pre-Processing
Data Augmentation Method
CNN semantic segmentation architectures
EXPERIMENTAL DESIGN
CONCLUSIONS