Abstract

In scenarios where sample collection is limited, few-shot learning algorithms such as the prototypical network (P-Net) are a key topic for supervised classification of multiple tree species. In a previous study, we improved the P-Net by combining a feature enhancement algorithm based on the convolutional block attention module with several popular data augmentation methods from the computer vision domain; this significantly increased classification accuracy and reduced model overfitting. However, a clear boundary remains between the data augmentations and the feature enhancement algorithm: the augmentations only enrich the diversity of the training samples and cannot directly affect the construction of the objective function, which limits their effectiveness. In supervised contrastive learning research, by contrast, data augmentation methods are routinely used to generate positive samples of an anchor image for constructing the objective function, i.e., the supervised contrastive loss. The core idea for resolving this boundary problem is to use contrastive learning to pull the anchor image toward its positive samples while pushing negative samples apart. Inspired by this, we introduce supervised contrastive learning into the P-Net, yielding SCL-P-Net, in which discriminative feature representations serve as constraints on the prototype clustering algorithm. In SCL-P-Net, data augmentation methods not only extend the sample distribution but are also used directly to construct the supervised contrastive loss. The study involves four airborne hyperspectral image datasets for tree species classification: the GFF-A and GFF-B datasets collected from Gaofeng Forest Farm in Nanning City, Guangxi Province, South China; the Xiongan dataset from Matiwan Village in Xiongan New Area, Hebei Province, North China; and the Tea Farm dataset from Fanglu Tea Farm in Changzhou City, Jiangsu Province, East China. The highest overall accuracies (OA) are 99.23% for GFF-A, 98.39% for GFF-B, 99.30% for Xiongan, and 99.54% for Tea Farm. We conclude that the proposed SCL-P-Net classification framework achieves high-precision classification of multiple tree species. Without changing the basic classification framework of the P-Net, the introduction of supervised contrastive learning couples the data augmentations with the feature enhancement algorithm and plays a positive role in improving the distinguishability between samples.
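As an illustration of the two ingredients the abstract describes, the following is a minimal PyTorch sketch of a supervised contrastive loss (positives are other samples, including augmented views, that share the anchor's label) alongside a standard prototypical-network loss. The exact network architecture, loss weighting, and episode construction used in SCL-P-Net are not specified here, so the function names, the `temperature` value, and the simple weighted sum at the end are illustrative assumptions rather than the authors' implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Supervised contrastive loss over a batch of embeddings.
    For each anchor, every other sample with the same label (e.g. its
    augmented views) is treated as a positive; all remaining samples
    are negatives. Anchors with no positives are skipped."""
    features = F.normalize(features, dim=1)                  # (N, D), unit norm
    sim = features @ features.T / temperature                # pairwise similarities
    n = features.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=features.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask

    # log-softmax over all samples except the anchor itself
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # mean log-probability of the positives for each valid anchor
    pos_counts = pos_mask.sum(1)
    valid = pos_counts > 0
    loss = -(log_prob * pos_mask.float()).sum(1)[valid] / pos_counts[valid]
    return loss.mean()

def prototypical_loss(support, support_labels, query, query_labels):
    """P-Net episode loss: class prototypes are the means of the support
    embeddings; queries are classified by negative Euclidean distance."""
    classes = support_labels.unique()
    prototypes = torch.stack([support[support_labels == c].mean(0) for c in classes])
    dists = torch.cdist(query, prototypes)                   # (Q, C)
    targets = (query_labels.unsqueeze(1) == classes.unsqueeze(0)).float().argmax(1)
    return F.cross_entropy(-dists, targets)

# Hypothetical joint objective: prototype clustering constrained by the
# discriminative representations learned through supervised contrast.
# total_loss = prototypical_loss(s, s_y, q, q_y) \
#              + 0.5 * supervised_contrastive_loss(torch.cat([s, q]), torch.cat([s_y, q_y]))
```

In this sketch the contrastive term is what lets augmented views enter the objective function directly, rather than serving only as extra training samples, which is the boundary problem the abstract addresses.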
