Comparison of Random Forest and Support Vector Machine Classifiers for Regional Land Cover Mapping Using Coarse Resolution FY-3C Images

Tesfaye Adugna, Jinlong Fan, Wenbo Xu

doi:10.3390/rs14030574

Abstract

The choice of classification algorithm strongly affects the accuracy of land cover maps derived from remote sensing imagery. In recent decades, machine learning (ML) has received great attention due to its robustness in remote sensing image classification. In this regard, random forest (RF) and support vector machine (SVM) are two of the most widely used ML algorithms for generating land cover (LC) maps from satellite imagery. Although several comparisons have been conducted between these two algorithms, the findings are contradictory. Moreover, the comparisons were made on local-scale LC map generation from either high or medium resolution images using various software, but not Python. In this paper, we compared the performance of these two algorithms for large area LC mapping of parts of Africa using coarse resolution imagery in the Python platform by employing the Scikit-Learn (sklearn) library. We employed a big dataset of 297 metrics, comprising systematically selected 9-month composite FengYun-3C (FY-3C) satellite images with 1 km resolution. Several experiments were performed using a range of values to determine the best values for the two most important parameters of each classifier (the number of trees and the number of variables for RF, and the penalty value and gamma for SVM) and to obtain the best model of each algorithm. Our results showed that RF outperformed SVM, yielding an overall accuracy (OA) of 0.86 and a kappa coefficient (k) of 0.83, which are 1–2% and 3% higher than the best SVM model, respectively. In addition, RF performed better in mixed class classification; however, it performed almost the same as SVM when classifying relatively pure classes with distinct spectral variation, i.e., those consisting of fewer mixed pixels. Furthermore, RF is more efficient in handling large input datasets where the SVM fails. Hence, RF is a more robust ML algorithm, especially for heterogeneous large area mapping using coarse resolution images. Finally, default parameter values in the sklearn library work well for satellite image classification with minor or no adjustment for these algorithms.
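The tuning procedure described in the abstract can be sketched with sklearn as follows. This is a minimal illustration, not the authors' code. The synthetic data from `make_classification` stands in for the FY-3C metrics, and the grid values, the RBF kernel, the `StandardScaler` step, and the 3-fold cross-validation are all assumptions chosen for brevity. In sklearn, the number of trees and the number of variables for RF correspond to `n_estimators` and `max_features`, and the SVM penalty value and gamma correspond to `C` and `gamma`.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for labelled pixel samples (NOT the FY-3C dataset):
# 600 samples, 20 features, 4 classes.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Illustrative parameter grids (assumed values, not those used in the paper).
searches = {
    # RF: number of trees and number of variables tried at each split.
    "RF": GridSearchCV(
        RandomForestClassifier(random_state=0),
        {"n_estimators": [50, 100], "max_features": ["sqrt", 0.5]},
        cv=3,
    ),
    # SVM: penalty value C and kernel coefficient gamma (RBF kernel assumed).
    # Features are standardized first because SVMs are scale-sensitive.
    "SVM": GridSearchCV(
        make_pipeline(StandardScaler(), SVC(kernel="rbf")),
        {"svc__C": [1, 10], "svc__gamma": ["scale", 0.01]},
        cv=3,
    ),
}

results = {}
for name, search in searches.items():
    search.fit(X_train, y_train)    # cross-validated grid search on the training set
    y_pred = search.predict(X_test)  # predict with the refitted best model
    results[name] = {
        "best_params": search.best_params_,
        "OA": accuracy_score(y_test, y_pred),        # overall accuracy
        "kappa": cohen_kappa_score(y_test, y_pred),  # Cohen's kappa coefficient
    }
    print(name, results[name])
```

The scores printed for this synthetic data say nothing about the relative performance of the two classifiers on real imagery. The sketch only shows how both models can be tuned and scored with the same OA and kappa metrics on a common held-out set.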

Highlights

Land cover (LC) information provides some of the most indispensable data in various sectors, including environmental, ecological, and climate change studies, and resource management and monitoring [1,2,3].
We aimed to compare the performance of these machine learning (ML) algorithms in generating a large area land cover map from a big input dataset of coarse resolution images obtained from the FengYun-3C (FY-3C) satellite.
In this research, we considered the same study area, input datasets, and reference data as the previous work, because the main aim of this work is to evaluate the performance of the two ML classifiers for large area/regional land cover mapping using big datasets of coarse resolution imagery.

Summary

Introduction

Land cover (LC) information provides some of the most indispensable data in various sectors, including environmental, ecological, and climate change studies, and resource management and monitoring [1,2,3]. It can be derived at different scales, which are broadly divided into three categories based either on the areal extent covered: local scale (a small area of 10⁰–10³ km²), regional scale (10⁴–10⁶ km²), and continental to global scale (>10⁶ km²) [4], or on spatial resolution: coarse (≥1 km), moderate (1 km–100 m), and fine (

Objectives

Methods

Results

Discussion

Conclusion