Abstract

Decision tree (DT) algorithms have been applied to classification and change detection in various geospatial studies and, more recently, to urban expansion and land use/land cover (LULC) change modeling. However, these studies have not detailed how the DT algorithms were specified with respect to data sampling, predictor variables, model configuration, and model evaluation. This study explores several balanced and unbalanced sampling methods, various predictor variables, different configurations of stopping rules, and reliable evaluation metrics to improve the performance of classification and regression tree (CART), one of the most effective DT algorithms, for urban expansion modeling. Applied to the Triangle Region of North Carolina (NC) over the period 2001 to 2011, the model achieves a training accuracy of 97%, a testing accuracy of 94%, and a Kappa value of 0.80. This performance was obtained with a training dataset containing all changed land cells plus three times as many randomly selected unchanged land cells, and with the minimum number of records in a leaf node set to 1, the minimum number of records in a parent node set to 2, and the maximum number of splits set to 10,000. The CART algorithm indicates that proximity to built areas, proximity to highways, current LULC type, elevation, and distance to water bodies are the most significant predictor variables for urban expansion prediction in the study area.
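As an illustration of the sampling scheme and the Kappa metric described above, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each land cell carries a binary label (1 for changed, 0 for unchanged); the function names `sample_training_cells` and `cohen_kappa`, the `ratio` and `seed` parameters, and the toy data are hypothetical and chosen only for this example. Tree fitting and the stopping rules are not covered here.

```python
import random


def sample_training_cells(labels, ratio=3, seed=0):
    """Return indices of all changed cells (label 1) plus `ratio` times as
    many randomly selected unchanged cells (label 0).

    Assumption: labels are 0/1 integers, one per land cell.
    """
    changed = [i for i, y in enumerate(labels) if y == 1]
    unchanged = [i for i, y in enumerate(labels) if y == 0]
    # Cap the draw in case there are fewer unchanged cells than requested.
    n_unchanged = min(len(unchanged), ratio * len(changed))
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    picked = rng.sample(unchanged, n_unchanged)
    return sorted(changed + picked)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: agreement between observed and predicted labels,
    corrected for the agreement expected by chance."""
    n = len(y_true)
    classes = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    expected = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes
    )
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy raster flattened to a list: 10 changed cells, 90 unchanged cells.
    labels = [1] * 10 + [0] * 90
    train_idx = sample_training_cells(labels, ratio=3)
    print(len(train_idx), "training cells selected")

    # Toy evaluation on four cells.
    print("kappa =", cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Kappa is reported alongside accuracy because unchanged cells typically far outnumber changed ones, so a model that predicts "unchanged" everywhere can score a high accuracy while showing no agreement beyond chance.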
