We introduce two semi-supervised models for the classification of remote sensing image data. The models are built upon the framework of Virtual Support Vector Machines (VSVM). Generally, VSVM follow a two-step learning procedure: A Support Vector Machines (SVM) model is learned to determine and extract labeled samples that constitute the decision boundary with the maximum margin between thematic classes, i.e., the Support Vectors (SVs). The SVs govern the creation of so-called virtual samples. This is done by modifying, i.e., perturbing, the image features to which a decision boundary needs to be invariant. Subsequently, the classification model is learned for a second time by using the newly created virtual samples in addition to the SVs to eventually find a new optimal decision boundary. Here, we extend this concept by (i) integrating a constrained set of semi-labeled samples when establishing the final model. Thereby, the model constrainment, i.e., the selection mechanism for including solely informative semi-labeled samples, is built upon a self-learning procedure composed of two active learning heuristics. Additionally, (ii) we consecutively deploy semi-labeled samples for the creation of semi-labeled virtual samples by modifying the image features of semi-labeled samples that have become semi-labeled SVs after an initial model run. We present experimental results from classifying two multispectral data sets with a sub-meter geometric resolution. The proposed semi-supervised VSVM models exhibit the most favorable performance compared to related SVM and VSVM-based approaches, as well as (semi-)supervised CNNs, in situations with a very limited amount of available prior knowledge, i.e., labeled samples.
Read full abstract