Evaluation of Light Gradient Boosted Machine Learning Technique in Large Scale Land Use and Land Cover Classification

Dakota Aaron Mccarty, Hye Kyung Lee, Hyun Woo Kim

doi:10.3390/environments7100084

Open Access

Journal: Environments	Publication Date: Oct 3, 2020
Citations: 45	License type: CC BY 4.0

Affiliation: Incheon National University, Dankook University

Abstract

The ability to rapidly, regularly, and consistently produce accurate land use and land cover maps has become a growing priority, as these maps are an increasingly important tool in efforts to evaluate, monitor, and conserve Earth’s natural resources. Algorithms for supervised classification of satellite images are a necessary tool for building these maps, and they have helped establish remote sensing as the most reliable means of map generation. In this paper, we compare three machine learning techniques, Random Forest, Support Vector Machines, and Light Gradient Boosted Machine, using a 70/30 training/testing evaluation model. Our research evaluates the accuracy of Light Gradient Boosted Machine models against the more classic and trusted Random Forest and Support Vector Machines in classifying land use and land cover over large geographic areas. We found that the Light Gradient Boosted model is marginally more accurate, with increases in overall accuracy of 0.01 and 0.059 over Support Vector Machines and Random Forest, respectively, and that it also ran around 25% faster on average.
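The 70/30 training/testing evaluation and the overall accuracy measure mentioned above can be sketched as follows. This is a minimal, standard-library illustration, not the authors' pipeline: the function names `split_70_30` and `overall_accuracy`, the fixed seed, and the toy samples are all assumptions made for the example.

```python
import random


def split_70_30(samples, train_fraction=0.7, seed=0):
    """Shuffle labeled samples and split them into training and testing sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of test samples whose predicted class matches the reference class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical labeled pixels: (feature vector, class label).
samples = [((i, i % 3), "water" if i % 2 else "forest") for i in range(10)]
train, test = split_70_30(samples)
```

Each classifier under comparison would be fitted on `train` and scored with `overall_accuracy` on `test`, so that all models are judged on the same held-out samples.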

Highlights

The classification of images acquired by remote sensing in the context of land use mapping consists of assigning each pixel of the image to a class.
The highest Overall Accuracy (OA) was produced by LightGBM (0.653), closely followed by Support vector machines (SVM) (0.642) and
With the powerful Chi-squared test [41], we can compare the error matrices by testing whether the overall distribution of variables predicted by one algorithm equals that predicted by another.
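One way to read the Chi-squared comparison above is as a test of homogeneity on the per-class prediction counts of two classifiers (for example, the prediction totals taken from each error matrix). The sketch below follows that reading; the exact formulation used in the paper is not given in this excerpt, so the function name `chi_squared_homogeneity` and the example counts are assumptions.

```python
def chi_squared_homogeneity(counts_a, counts_b):
    """Chi-squared statistic for whether two classifiers predict the same class distribution.

    counts_a, counts_b: per-class predicted pixel counts, in the same class order.
    Returns (statistic, degrees_of_freedom).
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("both classifiers must use the same classes")
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each classifier needs at least one prediction")
    grand_total = total_a + total_b

    statistic = 0.0
    used_classes = 0
    for a, b in zip(counts_a, counts_b):
        column_total = a + b
        if column_total == 0:
            continue  # class predicted by neither classifier carries no information
        used_classes += 1
        for observed, row_total in ((a, total_a), (b, total_b)):
            # Expected count if both classifiers shared one class distribution.
            expected = row_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic, used_classes - 1


# Hypothetical prediction counts for two classifiers over two classes.
stat, dof = chi_squared_homogeneity([10, 20], [20, 10])
```

The statistic is then compared against a critical value of the chi-squared distribution with `dof` degrees of freedom; a value above it suggests the two algorithms distribute their predictions differently.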

Summary

Introduction

The classification of images acquired by remote sensing in the context of land use mapping consists of assigning each pixel of the image to a class. The set of classes selected to represent the scene of interest constitutes a taxonomy, or nomenclature, which varies according to the needs of the end user. Classes are attributed on the basis of a visual analysis specific to each pixel, which can rely on the pixel's visual description [1]. The term "supervised" comes from the training step of the algorithm, which consists of modeling the classes in play from a reference data set. Each class is modeled in relation to these attributes, whether the method is generative or discriminative.
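The two steps described above, modeling each class from reference data and then assigning every pixel to a class, can be illustrated with a deliberately simple nearest-centroid classifier. It is a stand-in chosen for brevity, not one of the three methods evaluated in the paper; the band values, class names, and function names are hypothetical.

```python
from collections import defaultdict


def train_centroids(training_pixels):
    """Training step: model each class by the mean feature vector of its reference pixels."""
    sums = {}
    counts = defaultdict(int)
    for bands, label in training_pixels:
        if label not in sums:
            sums[label] = [0.0] * len(bands)
        for i, value in enumerate(bands):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify_pixel(bands, centroids):
    """Assign one pixel to the class whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((x - c) ** 2 for x, c in zip(bands, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def classify_image(image, centroids):
    """Classify every pixel of an image given as rows of per-pixel band tuples."""
    return [[classify_pixel(pixel, centroids) for pixel in row] for row in image]


# Hypothetical reference data: (band values, class label).
training = [
    ((0.1, 0.5), "water"), ((0.2, 0.4), "water"),
    ((0.6, 0.9), "forest"), ((0.7, 0.8), "forest"),
]
centroids = train_centroids(training)
label_map = classify_image([[(0.1, 0.4), (0.7, 0.9)]], centroids)
```

The resulting `label_map` has the same shape as the input image, with each pixel replaced by a class from the chosen nomenclature.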

Objectives

Methods

Results

Conclusion