Abstract

Machine learning algorithms are crucial for crop identification and mapping. However, many studies focus only on the identification results of these algorithms and pay less attention to their classification performance and mechanisms. In this paper, Sentinel-2 images at 10 m resolution covering a specific phenological period of winter wheat were obtained through Google Earth Engine (GEE). Support vector machine (SVM), random forest (RF), and classification and regression tree (CART) algorithms were then employed to identify and map winter wheat over a large area. The hyperparameters of the three algorithms were tuned by grid search with 5-fold cross-validation. A comparison of their classification performance shows that SVM performs best in identifying winter wheat, with an overall accuracy (OA), user’s accuracy (UA), producer’s accuracy (PA), and kappa coefficient (Kappa) of 0.94, 0.95, 0.95, and 0.92, respectively. Moreover, 50 different combinations of training and validation sets were used to analyze the generalization ability of the algorithms; the average OA of SVM, RF, and CART is 0.93, 0.92, and 0.88, respectively, indicating that SVM and RF are more robust than CART. To further explore the sensitivity of SVM, RF, and CART to their parameters, namely (C and gamma), (tree and split), and (maxD and minSP), respectively, we used grid search to iterate over these parameters and analyzed their effect on the accuracy scores and classification residuals. When (C and gamma) vary over (0.01~1000), the maximum variation in SVM’s accuracy score is as high as 0.63, and the maximum variation in residuals is 76,215 km². We conclude that SVM is sensitive to the parameters (C and gamma) and presents a positive correlation. When the parameters (tree and split) vary over (100~600) and (1~6), respectively, the maximum variation in RF’s accuracy score is 0.08, and the maximum variation in residuals is 1157 km², indicating that RF has low sensitivity to the parameters (tree and split). When the parameters (maxD and minSP) vary over (10~60), the maximum variation in CART’s accuracy score is 0.06, and the maximum variation in residuals is 6943 km². Therefore, compared to RF, CART is sensitive to the parameters (maxD and minSP) and has poor robustness. In general, under the tuned hyperparameters, SVM and RF exhibit the best classification performance, while CART performs relatively poorly. Meanwhile, the three algorithms differ in their sensitivity to the algorithm parameters: SVM and CART are more sensitive, while RF has low sensitivity to parameter changes. Because different parameter values cause large changes in the accuracy scores and residuals, the algorithm hyperparameters need to be determined carefully. Default parameters are generally sufficient to achieve crop classification, but when the best classification result is required, we recommend an enumeration method such as grid search as a practical way to improve classification performance.
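The tuning and evaluation steps named in the abstract (grid search, 5-fold cross-validation, and the OA, UA, PA, and Kappa scores) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' GEE workflow: the function names `accuracy_metrics`, `k_fold_indices`, and `grid_search` are hypothetical, the confusion-matrix counts are invented, and `toy_score` is a stand-in for actually training and validating a classifier on each fold.

```python
from itertools import product
from statistics import mean


def accuracy_metrics(cm):
    """Return (OA, UA, PA, Kappa) for a square confusion matrix.

    cm[i][j] is the number of reference-class-i samples predicted as class j.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(k)]
    row_tot = [sum(cm[i]) for i in range(k)]                         # reference totals
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]    # predicted totals
    oa = sum(diag) / n                                   # overall accuracy
    pa = [diag[i] / row_tot[i] for i in range(k)]        # producer's accuracy per class
    ua = [diag[j] / col_tot[j] for j in range(k)]        # user's accuracy per class
    pe = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2  # chance agreement
    kappa = (oa - pe) / (1 - pe)
    return oa, ua, pa, kappa


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs; fold f validates on samples f, f+k, f+2k, ..."""
    for f in range(k):
        val = list(range(f, n_samples, k))
        val_set = set(val)
        train = [i for i in range(n_samples) if i not in val_set]
        yield train, val


def grid_search(fit_and_score, grid, n_samples, k=5):
    """Enumerate every parameter combination and score it by mean k-fold validation score."""
    names = list(grid)
    results = {}
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results[values] = mean(
            fit_and_score(params, train, val)
            for train, val in k_fold_indices(n_samples, k)
        )
    best = max(results, key=results.get)
    return dict(zip(names, best)), results


# Hypothetical confusion matrix (wheat vs. non-wheat); the counts are invented.
oa, ua, pa, kappa = accuracy_metrics([[45, 5], [3, 47]])


# Stand-in scorer: a real one would train on train_idx and return validation accuracy.
def toy_score(params, train_idx, val_idx):
    return -abs(params["C"] - 10) - abs(params["gamma"] - 0.1)


# Assumed grid values inside the 0.01~1000 range quoted in the abstract.
best_params, all_scores = grid_search(
    toy_score, {"C": [0.01, 1, 10, 1000], "gamma": [0.01, 0.1, 1000]}, n_samples=100
)
# Spread between the best and worst combination, analogous to the "maximum variation" above.
spread = max(all_scores.values()) - min(all_scores.values())
```

The spread between the best and worst grid entries is the kind of quantity the abstract reports as the maximum variation of the accuracy score. With a real classifier substituted for `toy_score`, the same loop would show how strongly accuracy depends on the chosen parameter pair.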

Highlights

  • Wheat is one of the three major food crops across the world, providing a stable source of food and nutrition for humans [1]

  • Song [35] applied support vector machine (SVM) and artificial neural networks (ANN) to SPOT-5 image classification, and the results showed that SVM performed slightly better than ANN

  • The results showed that, compared with the NN and classification and regression tree (CART) algorithms, SVM generalizes better when the amount of data is small, with a highest accuracy of 83%


Summary

Introduction

Wheat is one of the three major food crops across the world, providing a stable source of food and nutrition for humans [1]. In addition to relying on traditional spectral information, object-based image analysis (OBIA) [11], multitemporal information [12], and phenology and other methods [13], more and more researchers are using machine learning algorithms for crop identification, such as support vector machine (SVM) [8], random forest (RF) [14], classification and regression tree (CART) [15], k-nearest neighbor (KNN) [16], neural networks (NN) [17], and maximum likelihood (ML) [18]. With suitable hyperparameters, these algorithms can perform classification quickly and effectively. Crop identification and land use/cover research in large-scale regions often makes use of low- and medium-resolution remote-sensing images, such as Advanced Very High Resolution Radiometer (AVHRR) and Moderate Resolution

