Training of Artificial Neural Networks Using Information-Rich Data

Shailesh Singh, Sharad Jain, András Bárdossy

doi:10.3390/hydrology1010040

Abstract

Artificial Neural Networks (ANNs) are classified as a data-driven technique, which implies that their learning improves as more training data are presented. This observation rests on the premise that a longer time series of training samples will contain more events of different types, and hence the generalization ability of the ANN will improve. However, a longer time series need not contain more information. If there is considerable repetition of the same type of information, the ANN may not become “wiser”, and computational effort and time may simply be wasted. This study assumes that there are segments in a long time series that contain a large amount of information. The reason behind this assumption is that the information contained in any hydrological series is not uniformly distributed, and it may be cyclic in nature. If an ANN is trained using these segments rather than the whole series, the training should be as good or better, depending on the information contained in the series. A pre-processing step can be used to select information-rich data for training. However, most conventional pre-processing methods do not perform well because of the large variation in magnitude and scale and the many zeros in the data series, so these information-rich segments are not easy to identify. In this study, the data depth function was used as a tool for identifying critical (information-rich) segments in a time series; this approach does not depend on large variation in magnitude, scale or the presence of many zeros in the data. Data from two gauging sites were used to compare the performance of ANNs trained on the whole data set with that of ANNs trained on just the data from critical events. Data for critical events were selected by two methods: using the depth function (the identification of critical events (ICE) algorithm) and using random selection.
Inter-comparison of the performance of the ANNs trained using the complete data sets and the pruned data sets shows that the ANN trained using the data from critical events, i.e., information-rich data (whose length could be one third to half of the series), gave results similar to those of the ANN trained using the complete data set. However, if the data set is pruned randomly, the performance of the ANN degrades significantly. The concept of this paper may be very useful for training data-driven models where the training time series is incomplete.
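The depth-based selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' ICE algorithm: the abstract does not say which depth function is used, so the sketch assumes a half-space (Tukey) depth approximated over a finite set of projection directions, two-step windows of the series, and a fixed fraction of lowest-depth windows to keep. The function names `embed`, `approx_halfspace_depth` and `select_critical`, the parameters `fraction` and `n_dirs`, and the toy series are all hypothetical.

```python
import math


def embed(series):
    """Turn a 1-D series into overlapping 2-D windows (x[t], x[t+1])."""
    return [(series[t], series[t + 1]) for t in range(len(series) - 1)]


def approx_halfspace_depth(point, cloud, n_dirs=36):
    """Approximate half-space depth of a 2-D point within a 2-D cloud.

    For each projection direction, count the points on either side of
    `point` (ties included) and keep the smallest count found. Only a
    finite set of directions is scanned, so this is an upper bound on
    the exact depth (an assumption of this sketch).
    """
    depth = len(cloud)
    for k in range(n_dirs):
        angle = math.pi * k / n_dirs
        ux, uy = math.cos(angle), math.sin(angle)
        proj_p = ux * point[0] + uy * point[1]
        projs = [ux * x + uy * y for x, y in cloud]
        below = sum(1 for v in projs if v <= proj_p)
        above = sum(1 for v in projs if v >= proj_p)
        depth = min(depth, below, above)
    return depth


def select_critical(series, fraction=0.4, n_dirs=36):
    """Return start indices of the lowest-depth (most unusual) windows."""
    windows = embed(series)
    depths = [approx_halfspace_depth(w, windows, n_dirs) for w in windows]
    n_keep = max(1, int(round(fraction * len(windows))))
    # Low depth means the window lies near the edge of the data cloud.
    order = sorted(range(len(windows)), key=lambda i: depths[i])
    return sorted(order[:n_keep]), depths


# Toy series: long dry spells (zeros) around a single flood event.
series = [0] * 10 + [1, 5, 20, 8, 3, 1] + [0] * 10
selected, depths = select_critical(series, fraction=0.28)
print("selected window starts:", selected)
```

Because the depth is computed from counts of points on each side of a projection rather than from distances, the many repeated zero windows end up deep inside the cloud, while the windows covering the flood event sit at its edge and are selected. Rescaling the series leaves the selection unchanged in this toy case.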

Highlights

As the name suggests, data-driven models (DDMs) try to infer the behaviour of a given system from the data presented for model training
Inter-comparison of the performance of the Artificial Neural Networks (ANNs) trained using the complete data sets and the pruned data sets shows that the ANN trained using the data from critical events, i.e., information-rich data, gave results similar to those of the ANN trained using the complete data set

Summary

Introduction

Data-driven models (DDMs) try to infer the behaviour of a given system from the data presented for model training. In a review of state-of-the-art approaches to ANN rainfall-runoff (R-R) modelling, Jain et al. [11] strongly recommend extensive research on several aspects of developing ANN R-R models. These include input selection, data division, ANN training, hybrid modelling and extrapolation beyond the range of training data. They found both models to be suitable for the task, but noted as limitations the large number of parameters in the adaptive network-based fuzzy system and the long computation time of the genetic-algorithm-based ANN.



Journal: Hydrology	Publication Date: Jul 22, 2014
Citations: 22	License type: CC BY 3.0


