Mapping Fluvial Landforms with Deep Similarity Learning

Patrice Carbonneau

doi:10.5194/egusphere-egu21-811

Abstract

Semantic image classification as practised in Earth Observation is poorly suited to mapping fluvial landforms, which are often composed of multiple landcover types such as water, riparian vegetation and exposed sediment. Deep learning methods developed in the field of computer vision for the purpose of image classification (i.e. the attribution of a single label, such as cat or dog, to an image) are in fact better suited to such landform mapping tasks. Notably, Convolutional Neural Networks (CNNs) have excelled at the task of labelling images. However, CNNs are notorious for requiring very large training sets that are laborious and costly to assemble. Similarity learning is a sub-field of deep learning and is best known for one-shot and few-shot learning methods. These approaches aim to reduce the need for large training sets by using CNN architectures to compare a single, or a few, known examples of a class to a new image and determine whether the new image is similar to the provided examples. Similarity learning rests on the concept of image embeddings, which are condensed, lower-dimensional vector representations of an image generated by a CNN. Ideally, and if a CNN is suitably trained, image embeddings will form clusters according to image classes, even if some of these classes were never used in the initial CNN training.

In this paper, we use similarity learning for the purpose of fluvial landform mapping from Sentinel-2 imagery. We use the True Color Image product with a spatial resolution of 10 meters and begin by manually extracting tiles of 128x128 pixels for 4 classes: non-river, meandering reaches, anastomosing reaches and braiding reaches. We use the DenseNet121 CNN topped with a densely connected layer of 8 nodes, which produces embeddings as 8-dimensional vectors. We then train this network with only 3 classes (non-river, meandering and anastomosing) using a categorical cross-entropy loss function.

Our first result is that, when applied to our image tiles, the embeddings produced by the trained CNN form 4 clusters. Despite not being used in the network training, the braiding river reach tiles produce embeddings that form a distinct cluster. We then use this CNN to perform few-shot learning with a Siamese triplet architecture that classifies a new tile based on only 3 examples of each class. Here we find that tiles from the non-river, meandering and anastomosing classes were classified with F1 scores of 72%, 87% and 84%, respectively. The braiding river tiles were classified with an F1 score of 80%. Whilst these performances are lower than the 90%+ performances expected from conventional CNNs, the prediction of a new class of objects (braiding reaches) from only 3 samples with an F1 score of 80% is unprecedented in river remote sensing. We will conclude the paper by extending the method to mapping fluvial landforms on entire Sentinel-2 tiles, and we will show how advanced cluster analyses of image embeddings can identify landform classes in an image without a priori decisions on the classes that are present in the image.
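
The few-shot step can be illustrated with a minimal sketch. It assumes that each tile has already been reduced to an 8-dimensional embedding, and that a new tile is assigned to the class whose 3 support embeddings are closest on average in Euclidean distance. The abstract does not state the decision rule used by the Siamese triplet architecture, so this nearest-support rule, the triplet-loss margin, the function names and the toy embeddings are all illustrative assumptions rather than the author's implementation.

```python
import math

CLASSES = ["non-river", "meandering", "anastomosing", "braiding"]


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: zero once the negative lies at least
    `margin` farther from the anchor than the positive does.
    The margin value is an assumption, not taken from the abstract."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def classify_few_shot(embedding, support):
    """Return the class whose support embeddings have the smallest
    mean distance to `embedding` (assumed decision rule)."""
    def mean_distance(examples):
        return sum(euclidean(embedding, e) for e in examples) / len(examples)
    return min(support, key=lambda label: mean_distance(support[label]))


def f1_score(y_true, y_pred, label):
    """Per-class F1: harmonic mean of precision and recall for `label`."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def toy_embedding(class_index, offset):
    """Hypothetical 8-D embedding: one dominant axis per class plus a
    small offset on the next axis to mimic within-class spread."""
    v = [0.0] * 8
    v[class_index] = 1.0
    v[(class_index + 1) % 8] = offset
    return v


# Three support examples per class, matching the 3-shot setting in the abstract.
support = {
    label: [toy_embedding(i, d) for d in (0.0, 0.1, 0.2)]
    for i, label in enumerate(CLASSES)
}

# A new tile whose (toy) embedding falls near the braiding cluster.
query = toy_embedding(3, 0.05)
predicted = classify_few_shot(query, support)
print(predicted)
```

In practice the embeddings would come from the trained CNN rather than from `toy_embedding`, and the per-class F1 scores would be computed over a held-out set of labelled tiles.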

Full Text