Abstract

In random forest methodology, an overall prediction or estimate is made by aggregating predictions made by individual decision trees. Popular implementations of random forests rely on different methods for aggregating predictions. In this study, we provide an empirical analysis of the performance of aggregation approaches available for classification and regression problems. We show that while the choice of aggregation scheme usually has little impact in regression, it can have a profound effect on probability estimation in classification problems. Our study illustrates the causes of calibration issues that arise from two popular aggregation approaches and highlights the important role that terminal nodesize plays in the aggregation of tree predictions. We show that optimal choices for random forest tuning parameters depend heavily on the manner in which tree predictions are aggregated.
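To make the distinction concrete, here is a minimal sketch (not the paper's code; the class counts are invented for illustration) contrasting two common ways of turning per-tree terminal-node class counts into an ensemble probability estimate. With small terminal nodes the two schemes can disagree substantially, which is one source of the calibration differences the abstract refers to.

```python
# Each "tree" is represented only by the class counts in the terminal node
# that a hypothetical query point falls into: (positives, total) per tree.
terminal_counts = [(1, 1), (1, 2), (0, 3)]

# Scheme 1: average each tree's within-node class proportion.
# Every tree gets equal weight, regardless of its terminal nodesize.
avg_of_proportions = sum(p / n for p, n in terminal_counts) / len(terminal_counts)

# Scheme 2: pool the terminal-node observations across trees before dividing.
# Trees with larger terminal nodes carry proportionally more weight.
pooled = sum(p for p, _ in terminal_counts) / sum(n for _, n in terminal_counts)

print(avg_of_proportions)  # 0.5   = (1/1 + 1/2 + 0/3) / 3
print(pooled)              # 0.333... = 2 / 6
```

The gap between the two estimates shrinks as terminal nodes grow, which is why the tuning of terminal nodesize interacts with the choice of aggregation scheme.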