Abstract

Fine-scale land cover classification of urban environments is important for a variety of applications. LiDAR data has been increasingly used, alone or in conjunction with other remote sensing data, for land cover classification because of its high geometric accuracy and its additional radiometric information. An important issue in the classification of remote sensing data is the inevitable imbalance of training samples, which usually results in poor classification performance for classes with few samples (minority classes). In this paper, a synergy of data mining sampling techniques with ensemble classifiers is proposed to address the data imbalance problem in the training datasets. Several sampling strategies are examined, including under-sampling of the majority classes, synthetic over-sampling of the minority classes, hybrid sampling, and under-sampling aggregation. The results from two different datasets show superior performance of ensemble classifiers when integrated with sampling techniques. In particular, under-sampling aggregation and hybrid sampling coupled with random forests resulted in 16.7% and 5.5% improvements in the G-mean measure in the two experimental datasets examined.
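
As an illustration of two ideas named in the abstract, the sketch below computes a multi-class G-mean (taken here to be the geometric mean of the per-class recalls, which is the common definition and an assumption about the paper's exact formula) and performs plain random under-sampling of the majority classes. The function names `g_mean` and `random_under_sample`, the choice to shrink every class to the size of the smallest one, and the toy labels are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import Counter, defaultdict


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed G-mean definition)."""
    total = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / total[c] for c in total]
    return math.prod(recalls) ** (1.0 / len(recalls))


def random_under_sample(samples, labels, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    n_min = min(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        for x in rng.sample(xs, n_min):  # sample without replacement
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy example: 8 "road" points and 2 "tree" points (hypothetical labels).
labels = ["road"] * 8 + ["tree"] * 2
points = list(range(len(labels)))
balanced_x, balanced_y = random_under_sample(points, labels, seed=42)
```

One possible reading of under-sampling aggregation is to repeat `random_under_sample` with different seeds, train one classifier per balanced subset, and combine their votes; that reading is an assumption here, and the paper itself should be consulted for the exact procedure. Because G-mean multiplies the recalls, a single class with zero recall drives the score to zero, which is why it is sensitive to poor minority-class performance.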
