Abstract

In the traditional genetic programming (GP) algorithm, models are evolved in a single deme (an environment in which a population of models is evolved), so the algorithm may tend to produce sub-optimal models with poor generalisation skills owing to a lack of model diversity. To address this issue, this study investigated the potential of evolving models in multiple parallel demes with different genetic attributes (parallel heterogeneous environments), followed by further evolution of some of the fittest models selected from each deme in another deme called the master deme. The approach was investigated in relation to the downscaling of large-scale climate data to daily minimum temperature (Tmin) and daily maximum temperature (Tmax). It was found that, independent of the climate regime (i.e. warm or cold) and the geographic location of the observation station, a fraction of the fittest models (e.g. 25%) obtained from the last generation of each deme is alone sufficient to formulate a diverse initial population of models for the master deme. Also, independent of the climate regime and the geographic location of the observation station, both daily Tmin and Tmax downscaling models developed with the parallel multi-population genetic programming (PMPGP) algorithm showed better generalisation skills than models developed with the traditional single-deme GP, even when the amount of redundant information in the predictor data was high. The models developed for daily Tmin and Tmax with the PMPGP algorithm also simulated fewer unphysically large outliers than models developed with the GP algorithm.
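The deme-and-migration scheme described above can be illustrated with a minimal sketch. It is not the study's implementation: the models here are simple coefficient pairs for y = a*x + b rather than GP expression trees, the "genetic attribute" that differs between demes is assumed to be a mutation scale, the demes are evolved one after another rather than in parallel, and all function names and parameter values are hypothetical.

```python
import random

def mse(model, data):
    """Mean squared error of a toy linear model (a, b); lower is fitter."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in data) / len(data)

def evolve(population, data, generations, mutation_scale, rng):
    """Toy evolution: keep the fitter half, refill with mutated copies.

    Stand-in for a GP run; returns the last generation ranked by fitness.
    """
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: mse(m, data))
        survivors = ranked[: len(ranked) // 2]
        children = []
        while len(survivors) + len(children) < len(ranked):
            parent = rng.choice(survivors)
            children.append(tuple(g + rng.gauss(0.0, mutation_scale) for g in parent))
        population = survivors + children
    return sorted(population, key=lambda m: mse(m, data))

def select_migrants(ranked_population, fraction):
    """Take the fittest fraction of a ranked last generation."""
    count = max(1, int(len(ranked_population) * fraction))
    return ranked_population[:count]

def run_sketch(seed=0, deme_size=20, generations=30, fraction=0.25):
    rng = random.Random(seed)
    # Synthetic data (assumption): y = 2x + 1, purely for illustration.
    data = [(x, 2.0 * x + 1.0) for x in range(-5, 6)]
    # Heterogeneous demes: each uses a different (assumed) mutation scale.
    mutation_scales = [0.05, 0.2, 0.8]
    master_initial = []
    for scale in mutation_scales:
        deme = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(deme_size)]
        last_generation = evolve(deme, data, generations, scale, rng)
        # Migration: only the fittest fraction of each deme moves on.
        master_initial.extend(select_migrants(last_generation, fraction))
    # Further evolution of the migrants in the master deme.
    master_final = evolve(master_initial, data, generations, 0.1, rng)
    return master_initial, master_final, data
```

Calling `run_sketch()` returns the initial master population (the migrants), the ranked final master population, and the toy data. Because the fitter half always survives, the best error in the master deme cannot get worse than the best migrant's error.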

Highlights

  • General circulation models (GCMs) are widely used for the simulation of the historical climate and the projection of climate into the future (Mujumdar and Kumar 2012). Though GCMs are able to adequately simulate the large-scale climate (Wang et al 2015), they are not able to correctly simulate climate at catchment scale, because catchment-scale characteristics such as fine topographical features, land use, and convective processes are coarsely represented in their structures (Chu and Yu 2010)

  • Similar patterns of performance were seen at all stations for daily Tmin and Tmax. This indicated that the parallel multi-population genetic programming (PMPGP) algorithm was insensitive to the migration policy, on condition that some of the fittest models in the last generation of each deme were allowed to migrate to the master deme

  • Independent of the climate regime and the geographic location of the observation station, downscaling models developed for both daily minimum temperature (Tmin) and daily maximum temperature (Tmax) with the parallel multi-population genetic programming (PMPGP) algorithm showed no sensitivity to migration policy, on condition that a fraction (e.g. 25%) of the fittest models in the last generation of each deme were allowed to migrate to the master deme


Summary

Introduction

General circulation models (GCMs) are widely used for the simulation of the historical climate and the projection of climate into the future (Mujumdar and Kumar 2012). There is a need to further investigate non-linear input selection approaches that can effectively reduce information redundancy, omit irrelevant information, reduce the dimensionality of input data and select an optimum set of predictors for statistical downscaling models. Among the plethora of regression techniques used in statistical downscaling, genetic programming (GP), which is inspired by Darwin’s theory of biological evolution, can be considered a unique technique as it: (1) models both linear and non-linear relationships, (2) identifies an optimum set of predictors while evolving predictors–predictand relationships, (3) produces explicit equations relating predictors to the predictand (predictors–predictand relationships) and (4) filters out irrelevant and redundant information in the set of predictors through evolution (Koza 1992). This study also investigated whether PMPGP-based models generate unphysically large outliers.

Study area and data
Techniques
Genetic programming
Parallel multi population genetic programming
Methodology
Predictand selection
Delineation of an atmospheric domain
Probable predictor selection
Probable predictor–predictand association
Ranking of probable predictors and extraction of potential predictors
Reduction of data redundancies in large sets of predictors
Downscaling model development
PMPGP-based model development
GP-based model development
Results and discussion
Comparison of performance of GP and PMPGP-based downscaling models
Impact of increase in population size in GP algorithm on model performance
Caveats of the study
Conclusions