Abstract

• ELA metrics can characterize the features of the error surfaces of ANNs.
• Error surfaces of simple ANNs have a more well-defined overall shape.
• Error surfaces of complex ANNs are flatter, with many distributed, deep local optima.
• Simple ANNs can be calibrated successfully using gradient-based methods.
• Hybrid approaches combining global and local search are useful for complex ANN calibration.

Artificial Neural Network (ANN) models have been used for hydrological and water resources modelling for several decades, and their calibration ("training") has received much attention. The aim of automated calibration is to identify values of the model parameters that minimize an error measure between model outputs and corresponding measured values. The degree of difficulty associated with this process depends on two factors: the nature of the relationship between changes in model parameter values and the corresponding changes in the error measure (i.e. the features of the error surface), and the automated method used to identify the combination of model parameters that minimizes the error measure (i.e. the optimization method used to find the global optimum in the error surface). While the impact of different optimization methods has been studied extensively, how the features of the error surface change for different model structures has not. In this paper, it is demonstrated that Exploratory Landscape Analysis (ELA) metrics can be used successfully to characterize the features of the error surfaces of ANN models, thereby helping to explain the reasons for an increase or decrease in calibration difficulty and, in doing so, shedding new light on findings in the existing literature.
Results show that the error surfaces of single-layer multi-layer perceptrons (MLPs) with fewer hidden nodes have a more well-defined overall shape and fewer local optima, while the error surfaces of single-layer MLPs with a larger number of hidden nodes are flatter and have many distributed, deep local optima. Consequently, single-layer MLPs with a smaller number of hidden nodes can be calibrated successfully using gradient-based methods, such as the back-propagation algorithm, whereas single-layer MLPs with a relatively large number of hidden nodes are best calibrated using hybrid approaches that go beyond basic gradient-based methods (for example, global evolutionary algorithms coupled with a local search).
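To make the idea of characterizing an error surface concrete, the sketch below estimates one simple ELA-style metric, fitness–distance correlation (FDC), for the error surface of a tiny single-layer MLP. This is an illustrative assumption, not the paper's own procedure: the data, network size, sampling ranges, and choice of metric are all hypothetical. An FDC near +1 indicates that error decreases steadily as samples approach the best-found point, consistent with a single well-defined basin; values near zero suggest a flatter, more multimodal surface.

```python
# Illustrative sketch (not the paper's method): estimating fitness-distance
# correlation (FDC), one simple ELA-style metric, for the error surface of a
# tiny single-layer MLP. Data, sizes, and sampling ranges are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data
X = rng.uniform(-1, 1, size=(200, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

N_HIDDEN = 3
N_IN = X.shape[1]
DIM = N_IN * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1  # W1 + b1 + W2 + b2

def mlp_mse(w):
    """MSE of a single-layer tanh MLP whose weights are packed into vector w."""
    W1 = w[: N_IN * N_HIDDEN].reshape(N_IN, N_HIDDEN)
    b1 = w[N_IN * N_HIDDEN : N_IN * N_HIDDEN + N_HIDDEN]
    W2 = w[N_IN * N_HIDDEN + N_HIDDEN : -1]
    b2 = w[-1]
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return np.mean((pred - y) ** 2)

# Sample the weight space uniformly and evaluate the error at each sample
samples = rng.uniform(-5, 5, size=(1000, DIM))
errors = np.array([mlp_mse(s) for s in samples])

# FDC: correlation between distance to the best sample and the error value
best = samples[np.argmin(errors)]
dists = np.linalg.norm(samples - best, axis=1)
fdc = np.corrcoef(dists, errors)[0, 1]
print(f"FDC estimate: {fdc:.3f}")
```

Repeating this estimate while increasing `N_HIDDEN` is one way to observe the trend the abstract describes: as the number of hidden nodes grows, the landscape-analysis metrics shift toward a flatter, more multimodal characterization.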
