Recurrent Neural Networks for Edge Intelligence

  • Abstract
  • Literature Map
  • Similar Papers
Abstract

Recurrent Neural Networks (RNNs) are pervasive in artificial intelligence applications such as speech recognition, predictive healthcare, and creative art. Although they deliver highly accurate solutions, they pose a massive challenge, "training havoc." At the same time, the current expansion of the IoT demands that intelligent models be deployed at the edge, yet growing model sizes and increasingly complex network architectures, designed for greater performance, work against portability to edge devices with real-time constraints on memory, latency, and energy. This article provides a detailed insight into the compression techniques widely used in deep learning, which have become key to mapping powerful RNNs onto resource-constrained devices. While compression of RNNs is the main focus of the survey, it also highlights the challenges encountered during training, since the training procedure directly influences both model performance and compressibility; recent advances that address these training challenges are discussed along with their strengths and drawbacks. In short, the survey covers the three-step process of architecture selection, efficient training, and choice of a compression technique suited to a resource-constrained environment. It can thus serve as a comprehensive guide for a developer tackling a time-series problem with an RNN solution at the edge.
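
As a concrete illustration of the kind of compression technique such a survey covers, the sketch below applies magnitude-based weight pruning to an LSTM layer in PyTorch; the layer sizes and the 80% sparsity target are illustrative assumptions, not values from the article.

```python
# Hedged sketch: magnitude-based pruning of an LSTM's weight matrices.
# This illustrates one common compression technique, not the survey's own method;
# the layer sizes and the `sparsity` target are made-up placeholders.
import torch
import torch.nn as nn

def prune_by_magnitude(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude entries so that `sparsity` fraction is removed."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).float()          # 1 = keep, 0 = prune

lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)

masks = {}
with torch.no_grad():
    for name, param in lstm.named_parameters():
        if "weight" in name:                           # prune W_ih and W_hh, leave biases
            mask = prune_by_magnitude(param, sparsity=0.8)
            param.mul_(mask)
            masks[name] = mask                         # reapply after each fine-tuning step

dense = sum(p.numel() for n, p in lstm.named_parameters() if "weight" in n)
kept = int(sum(m.sum().item() for m in masks.values()))
print(f"kept {kept}/{dense} weights ({100 * kept / dense:.1f}%)")
```

In practice the masks are kept and reapplied after every fine-tuning step so that pruned weights stay at zero while the remaining weights recover accuracy.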

Similar Papers
  • Dissertation
  • 10.17760/d20383685
High-performance and energy-efficient deep learning for resource-constrained devices
  • May 10, 2021
  • Ao Ren

Driven by the rapid development of deep neural networks (DNNs) in recent years, artificial intelligence applications have been flourishing in a spectrum of fields, such as image classification, object detection, machine translation, speech recognition, and smart homes. However, the enormous number of weight parameters and computations of DNN models require resource-rich devices, resulting in tremendous power and energy consumption. The sizes of the state-of-the-art DNN models are even increasingly massive, which further impede the deployment of DNNs in resource-constrained devices. This dissertation centers around addressing this challenge, and our efforts are classified into two directions: DNN model compression and hardware accelerator design. In pursuit of high-performance and energy-efficient DNN accelerator, investigating technologies beyond the conventional binary computing paradigm is desirable and we consider Stochastic computing (SC) a highly promising candidate. First, SC is a probabilistic computing paradigm, which uses a bit-sequence to represent a probability number by counting the number of ones in the sequence. This feature makes it suitable for DNN inference, which is essentially an approximate computing application and the final decision depends on the probabilities at the output layer. Second, SC is renowned as a footprint saver since many complex arithmetic operations can be implemented with simple logic components. For example, multiplication can be conducted with AND gates in SC. Consequently, these two fascinating features of SC make it a favorable alternative to conventional binary computing. In this dissertation, we present SC-DCNN, the first SC-based DCNN inference accelerator. Specifically, (i) we propose the design of diverse function blocks for the basic operations in DCNN. (ii) We propose the novel feature extraction blocks (FEBs), which are intended for extracting features from input feature maps. (iii) We propose comprehensive techniques to reduce the area and power (energy) consumption of weight storage. (iv) We propose holistic optimizations for the overall SC-DCNN architecture, with carefully selected layer-wise FEB configurations, to minimize area and power (energy) consumption while maintaining high network accuracy. Overall, the proposed SC-DCNN achieves the lowest hardware cost and energy consumption in implementing LeNet-5 compared with the state-of-the-art prior works. Besides emerging computing technologies, structured compression techniques, aiming at reducing the number of weights, and the corresponding accelerator design also require extensive research. Therefore, we study the block-circulant weight matrix (BCM)-based compression, which is suitable for serving this goal. BCM compression partitions the original weight matrix into blocks of square sub-matrices and each sub-matrix is trained into a circulant matrix. In a circulant matrix, each row vector can be produced by shifting its prior row vector by one element. Therefore, the whole matrix can be represented by only the first row vector, achieving significant storage and computation reduction as a result. The effectiveness of the algorithm has been verified on multiple representative DNN models, including both DCNNs for image classification and LSTMs for speech recognition. Moreover, we propose an ASIC accelerator design using the compression method. 
Experimental results show that the proposed BCM accelerator exhibits remarkable advantages in terms of power, throughput, and energy efficiency, indicating that this method is greatly desirable for resource-constrained devices running DNNs. In order to further boost compression ratios and advance energy-efficient deep learning, we propose ADMM-NN, a model-agnostic and systematic compression framework, unifying DNN pruning and quantization. In the proposed ADMM-NN, DNN compression is formulated as an optimization problem and solved using the alternating direction method of multipliers (ADMM). ADMM-NN first decomposes the optimization problem into two sub-problems. The first sub-problem is a neural network training problem with a regularization term, which regularizes the weight parameters to approach a specific compression pattern. The second problem is to find a local optimal compression pattern, which will then be fed back to the first problem. By iteratively solving the two relatively easy-to-solve sub-problems, a solution of the original problem can be found. Thereby, a high compression ratio can be obtained. Without accuracy loss, ADMM-NN achieved 85× and 24× pruning on LeNet-5 and AlexNet models, respectively. Combining weight pruning and quantization, we achieved 1,910× and 231× reductions in overall model size on these two benchmarks. Besides, 26× and 17.4× weight pruning ratios were observed on VGG-16 and ResNet-50, respectively. Furthermore, we propose a hardware-aware compression framework. Specifically, we studied the relationship between pruning ratios and speedups of running a pruned model, and the discovered relationship curve was then integrated into the framework to guide the pruning strategy. By applying the hardware-aware framework, ASIC synthesis results showed 3.6× overall speedup on conv1-conv5 layers of AlexNet. Based on ADMM-NN, we further propose a structured pruning algorithm for two reasons. First, ADMM-NN is a problem-solving framework that integrates neural network training and compression algorithm, but itself is not fixed to any specific compression algorithm. Therefore, we need to study an effective compression algorithm. Second, the conventional irregular pruning algorithm incurs high index and decoding overhead, and thus little acceleration can be achieved. Therefore, a pruning algorithm, which can produce a regular matrix structure and meanwhile is compatible with ADMM-NN, is also imperative. Since irregular pruning has been empirically proved to be able to achieve the highest compression ratios among a diverse variety of compression techniques, it is desirable to start the study by analyzing irregular pruning masks. We figured out three characteristics in the irregular pruning mask: (i) the number of retained weights in different rows varies significantly, and maintaining this variety helps sustain accuracy; (ii) denser rows are more sensitive to pruning than sparser rows; (iii) a block-max weight masking method is proposed to effectively sustain the overall salience of the weight matrix, meanwhile producing a high regularity. By leveraging the discovered characteristics, we propose the density-adaptive regular block (DARB) pruning, which can simultaneously achieve high compression ratios and high hardware performance. DARB was evaluated on five models, across three major application domains, and it outperforms the state-of-the-art prior generally by 2.4× to 4.8×. 
Besides, it achieves high decoding efficiency, which is defined as the number of activations selected for the corresponding retained weights per clock cycle. The hardware synthesis results showed that DARB outperforms block pruning with block size 4 × 4 and 8 × 8 by 14.3× and 3.6×, respectively. Meanwhile, DARB outperforms them on pruning ratios by 1.8× and 2.5×, respectively.
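
The block-circulant (BCM) idea described above can be sketched in a few lines of NumPy: each circulant block is stored as its first row alone, and its matrix-vector product can be computed with FFTs. The block size and values below are illustrative, not taken from the dissertation.

```python
# Hedged sketch of the circulant-block idea described above: a b x b circulant
# block is defined by its first row, and its matrix-vector product can be
# computed in O(b log b) with FFTs. Block size and values are illustrative.
import numpy as np

def circulant_from_first_row(row: np.ndarray) -> np.ndarray:
    """Full b x b block: each row is the previous row shifted by one element."""
    return np.stack([np.roll(row, i) for i in range(len(row))])

def circulant_matvec(row: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Multiply the block by x using only the stored first row."""
    # For a real first row, the eigenvalues of this row-shifted circulant are conj(FFT(row)).
    return np.real(np.fft.ifft(np.conj(np.fft.fft(row)) * np.fft.fft(x)))

b = 8
row = np.random.randn(b)      # the only values that need to be stored for this block
x = np.random.randn(b)

full = circulant_from_first_row(row)
assert np.allclose(full @ x, circulant_matvec(row, x))
print(f"stored parameters per block: {b} instead of {b * b}")
```

The storage drops from b² to b values per block, and the FFT-based product is the source of the computation reduction the abstract mentions.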

  • Research Article
  • Cited by 63
  • 10.1109/tcpmt.2021.3071351
Long Short-Term Memory Neural Networks for Modeling Nonlinear Electronic Components
  • Apr 6, 2021
  • IEEE Transactions on Components, Packaging and Manufacturing Technology
  • Mahvash Moradi A + 2 more

This article presents a new macromodeling approach for nonlinear electronic components and circuits based on the long short-term memory (LSTM) neural network. LSTM offers a more efficient training process than conventional recurrent neural network (RNN) training: conventional structures such as the RNN suffer from the vanishing-gradient problem during training, which LSTM addresses efficiently. In order to train the proposed structure, some input and output waveforms of the original circuit, called training waveforms, must be obtained from simulation tools or measurements. Model creation using the proposed method does not require information about the internal details of the components; the input-output training waveforms are sufficient to construct the model. The numerical results provided in this article show that the proposed method is more efficient than RNN techniques for modeling components and packages in terms of both speed and accuracy. The findings suggest that the proposed method significantly reduces training time in comparison with conventional state-of-the-art modeling techniques. Furthermore, the simulation time of the model obtained from the proposed technique is less than that of both conventional models (such as SPICE models) used in circuit simulation tools and models obtained from the conventional RNN method. Three practical examples, namely an audio amplifier, the Texas Instruments (TI) SN74AHCT540 device, and a MOS inverter, are used to demonstrate the validity of the proposed macromodeling approach.
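
A minimal sketch of the waveform-to-waveform LSTM training described above, assuming PyTorch; the network size, the synthetic "circuit" (a tanh nonlinearity standing in for measured behavior), and the training schedule are placeholders rather than the authors' setup.

```python
# Hedged sketch of waveform-to-waveform LSTM macromodeling: the network maps an
# input voltage waveform to the circuit's output waveform. Shapes, sizes, and
# the synthetic "training waveforms" below are placeholders only.
import math
import torch
import torch.nn as nn

class LSTMMacromodel(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.readout = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, time, 1) input waveform
        h, _ = self.lstm(x)
        return self.readout(h)                 # y: (batch, time, 1) predicted output waveform

# Stand-in "training waveforms": a tanh nonlinearity playing the role of the circuit.
t = torch.linspace(0, 1, 200)
x = torch.sin(2 * math.pi * 5 * t).reshape(1, -1, 1)
y = torch.tanh(3 * x)                          # pretend measured output

model = LSTMMacromodel()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()
print(f"final training MSE: {loss.item():.2e}")
```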

  • Research Article
  • 10.46632/cset/3/4/3
A Comparative Study of Recurrent Neural Network (RNN) with Gray Relational Analysis for Temporal Data
  • Dec 6, 2025
  • Computer Science, Engineering and Technology

A Recurrent Neural Network (RNN) is a specialized form of neural network that is adept at handling sequential data by retaining information from prior inputs. In contrast to conventional feedforward neural networks, RNNs incorporate loops in their architecture, allowing them to leverage data from previous time steps to affect the current output. This characteristic renders RNNs especially effective for applications that involve sequences, including time-series forecasting, natural language processing, and speech recognition. A fundamental component of RNNs is their hidden state, which acts as a dynamic memory that is refreshed with each incoming input. This allows RNNs to capture dependencies across time steps, which is crucial for understanding context in sequences. In language modeling, the interpretation of a word often relies on the words that come before it, a task that RNNs handle well. However, RNNs struggle with issues like vanishing gradients, which hinder their ability to capture long-range dependencies. To overcome this, models such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs) were introduced. These models incorporate gates that regulate the flow of information, allowing them to better learn long-term dependencies. RNNs remain a powerful tool for working with sequential data, facilitating the modeling of temporal relationships, but their effectiveness depends on careful design and optimization.

Research significance: Recurrent Neural Networks (RNNs) hold significant research value because of their capacity to simulate temporal and sequential data, which is essential in many fields. They are frequently employed in natural language processing for tasks such as sentiment analysis, language translation, and text generation. In time-series analysis, RNNs enable accurate forecasting in finance, healthcare, and climate modeling. They are also essential in speech recognition and video processing, handling dependencies across time steps. Research focuses on improving RNNs, addressing challenges like vanishing gradients, and enhancing efficiency through architectures like LSTMs and GRUs, solidifying their relevance in advancing AI and machine learning applications.

Methodology: Gray relational analysis (GRA) is a technique for analyzing the relationships between several variables, particularly in situations where data is limited or unclear. To comprehend the relationships between variables, it evaluates how similar or different they are. GRA aids decision-makers in identifying critical factors, prioritizing actions, and improving processes in complex fields like engineering, finance, and management. By converting both qualitative and quantitative data into gray numbers, GRA addresses uncertainty and provides valuable insights for problem-solving, decision-making, and performance improvement, leading to more informed and effective strategies. The alternatives considered are Simple RNN, LSTM, GRU, Bidirectional RNN, Deep RNN, Vanilla RNN, Echo State Network, Attention-based RNN, Transformer RNN, and GRU with Attention. The evaluation criteria are Prediction Accuracy, Model Robustness, Learning Efficiency, Training Time, and Complexity. According to the results, Deep RNN ranks highest and Attention-based RNN has the lowest score.
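
A minimal NumPy sketch of the GRA ranking procedure outlined above (normalize the criteria, compute gray relational coefficients against an ideal reference, average into a grade); the alternatives, criteria, and scores below are invented placeholders, not the paper's data.

```python
# Hedged sketch of Gray Relational Analysis (GRA). The alternatives, criteria,
# and score matrix are invented placeholders; the paper's data and weights are
# not reproduced here.
import numpy as np

alternatives = ["Simple RNN", "LSTM", "GRU", "Deep RNN"]     # subset, illustrative
criteria_is_benefit = [True, True, False]                    # accuracy up, robustness up, training time down
scores = np.array([
    [0.82, 0.70, 1.0],
    [0.90, 0.85, 2.5],
    [0.88, 0.83, 2.0],
    [0.93, 0.88, 3.5],
])

# 1) Normalize each criterion to [0, 1] (invert cost criteria).
norm = np.empty_like(scores, dtype=float)
for j, benefit in enumerate(criteria_is_benefit):
    col = scores[:, j]
    rng = col.max() - col.min()
    norm[:, j] = (col - col.min()) / rng if benefit else (col.max() - col) / rng

# 2) Gray relational coefficients against the ideal reference (all ones).
zeta = 0.5                                   # distinguishing coefficient
delta = np.abs(1.0 - norm)
coeff = (delta.min() + zeta * delta.max()) / (delta + zeta * delta.max())

# 3) Gray relational grade = mean coefficient; a higher grade means a better rank.
grade = coeff.mean(axis=1)
for name, g in sorted(zip(alternatives, grade), key=lambda p: -p[1]):
    print(f"{name:10s}  grade = {g:.3f}")
```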

  • Conference Article
  • 10.1190/iwmg2021-27.1
Forward modeling and inversion based on deep learning by using an effective optimal nearly analytic discrete method
  • Feb 24, 2022
  • Lu Fan + 2 more

In this paper, we implement forward modeling and inversion based on deep learning strategies using an effective optimal nearly analytic discrete (ONAD) method. The forward modeling method combines the ONAD method with a recurrent neural network (RNN). An RNN is a kind of neural network suited to sequential data that uses information from both the previous and the current time step to produce its output. ONAD is an effective forward modeling method and is similar to an RNN in that it uses the previous wavefield to calculate the current wavefield. Therefore, we express the ONAD method within an RNN framework to advance the time iteration of the acoustic wave equation, which simplifies the programming by using RNN operations and convolution kernels. Next, based on the proposed forward modeling method, we use deep learning to study full waveform inversion (FWI) problems. Since the main purpose of inversion is to minimize the error between real data and synthetic data, inversion is essentially an optimization problem. The deep learning framework offers many new optimization ideas, such as the Adam and Nadam optimizers, which can achieve better inversion results than the traditional optimizers used in FWI. We carry out six numerical experiments. The first two show the forward modeling results, which indicate that the forward modeling method can effectively suppress numerical dispersion. The other four experiments show the inversion results. We compare several optimizers used in deep learning and find that the Nadam optimizer, based on the ONAD method combined with the RNN, can almost recover the true velocity model with fast convergence and great effectiveness. These numerical experiments highlight the effectiveness of forward modeling and inversion based on deep learning using the ONAD method.
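
To illustrate the "time stepping as a differentiable recurrence" idea, the sketch below advances a plain 1-D second-order finite-difference acoustic scheme (a simple stand-in for ONAD, which is not reproduced here) inside an autograd graph and fits the velocity model with Adam; grid sizes, the source wavelet, and the models are toy values.

```python
# Hedged sketch: differentiable time stepping for a 1-D acoustic equation plus
# Adam-based inversion. This uses a basic finite-difference stencil, NOT the
# ONAD scheme of the paper; all sizes and models are toy values.
import math
import torch

nx, nt, dx, dt = 100, 300, 10.0, 1e-3
src = torch.zeros(nx); src[50] = 1.0                      # point-source location
rec_ix = 70                                               # "receiver" grid point

def ricker(nt, dt, f0=15.0):
    t = torch.arange(nt) * dt - 1.0 / f0
    a = (math.pi * f0 * t) ** 2
    return (1 - 2 * a) * torch.exp(-a)

def forward(vel, wavelet):
    """Differentiable time stepping: each step is a recurrence over (u_prev, u_curr)."""
    u_prev = torch.zeros(nx)
    u_curr = torch.zeros(nx)
    trace = []
    for it in range(nt):
        lap = (u_curr[2:] - 2 * u_curr[1:-1] + u_curr[:-2]) / dx ** 2
        lap = torch.nn.functional.pad(lap, (1, 1))        # zero wavefield at the boundaries
        u_next = 2 * u_curr - u_prev + (vel * dt) ** 2 * lap + wavelet[it] * src
        trace.append(u_next[rec_ix])
        u_prev, u_curr = u_curr, u_next
    return torch.stack(trace)

wavelet = ricker(nt, dt)
true_vel = torch.full((nx,), 2000.0); true_vel[60:] = 2500.0
observed = forward(true_vel, wavelet).detach()            # synthetic "real data"

vel = torch.full((nx,), 2200.0, requires_grad=True)       # initial velocity guess
opt = torch.optim.Adam([vel], lr=20.0)
for epoch in range(50):
    opt.zero_grad()
    loss = torch.mean((forward(vel, wavelet) - observed) ** 2)
    loss.backward()
    opt.step()
print(f"data misfit after 50 Adam steps: {loss.item():.3e}")
```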

  • Conference Article
  • Cited by 5
  • 10.1109/vcip.2018.8698617
Synthesizing 3D Acoustic-Articulatory Mapping Trajectories: Predicting Articulatory Movements by Long-Term Recurrent Convolutional Neural Network
  • Dec 1, 2018
  • Lingyun Yu + 2 more

Robust and accurate prediction of articulatory movements has various important applications, such as 3D articulatory animation and visual communication. Various approaches have been proposed to solve the acoustic-articulatory mapping problem, but their precision is not high enough. Recently, deep neural networks (DNNs), especially convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have brought tremendous success in speech recognition and synthesis. To increase the accuracy, we propose a new network architecture for acoustic-articulatory mapping, called the long-term recurrent convolutional neural network (LTRCNN). The network consists of a CNN, an RNN, and a skip connection. The CNN efficiently models the spectral correlation among acoustic features, while an RNN such as the long short-term memory (LSTM) network can powerfully learn temporal context from sequential data. In addition, skip connections enrich the input representation with features from different levels to preserve feature information. Experiments show that LTRCNN achieves a state-of-the-art root-mean-squared error (RMSE) of 0.690 mm and a correlation coefficient of 0.949 on this prediction task.
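
A simplified PyTorch module in the spirit of the LTRCNN described above: a convolutional front end over acoustic frames, an LSTM over time, and a skip connection that concatenates the CNN features with the input; all dimensions are invented for illustration.

```python
# Hedged sketch in the spirit of the LTRCNN described above: CNN front end,
# LSTM over time, and a skip connection concatenating CNN features with the
# raw input. Dimensions are invented, not the paper's configuration.
import torch
import torch.nn as nn

class LTRCNNSketch(nn.Module):
    def __init__(self, n_feats: int = 40, n_targets: int = 12, hidden: int = 128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_feats, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(64 + n_feats, hidden, batch_first=True)   # skip: concat input
        self.out = nn.Linear(hidden, n_targets)                       # articulatory targets

    def forward(self, x):                      # x: (batch, time, n_feats) acoustic frames
        c = self.cnn(x.transpose(1, 2)).transpose(1, 2)               # (batch, time, 64)
        h, _ = self.lstm(torch.cat([c, x], dim=-1))                   # skip connection
        return self.out(h)                                            # (batch, time, n_targets)

model = LTRCNNSketch()
frames = torch.randn(2, 100, 40)               # two utterances, 100 frames, 40-dim features
print(model(frames).shape)                     # torch.Size([2, 100, 12])
```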

  • Conference Article
  • Cited by 1
  • 10.1109/sipnn.1994.344924
Recurrent sub neural networks applied to speech recognition
  • Apr 13, 1994
  • Wei-Ying Li + 3 more

Recurrent neural networks (RNNs) can be used to handle sequential patterns and have been used for speech recognition. To overcome the shortcomings of RNN, recurrent sub neural networks (RSNNs) are used, where an RSNN is built independently for each class. The training algorithm of the RSNN is based on the backpropagation algorithm. Speaker dependent connected Chinese digit-speech recognition experiments were carried out. Some factors influencing the performance of RSNNs have been studied. The experiments show that RSNN is easier to train and gives higher performance than RNN.

  • Research Article
  • 10.54254/2755-2721/6/20230879
Comparative analysis between application of transformer and recurrent neural network in speech recognition
  • Jun 14, 2023
  • Applied and Computational Engineering
  • Zhehao Liao

The Transformer is a deep learning model based on the self-attention mechanism that is widely used for sequence-to-sequence problems, including speech recognition. Since the Transformer was proposed, it has been greatly developed and has made great progress in the field of speech recognition. The Recurrent Neural Network (RNN) is another model that can be used in speech recognition. Speech recognition is a sequence-to-sequence task that transforms human speech into text. Both the RNN and the Transformer use an encoder-decoder architecture to solve sequence-to-sequence problems. However, the RNN is a recurrent model that is weak in parallel training, and it does not perform as well as the non-recurrent Transformer on sequence-to-sequence tasks. This paper mainly analyzes the accuracy of the Transformer and the RNN in automatic speech recognition. It shows that the Transformer performs better than the RNN in speech recognition, achieving higher accuracy, and therefore provides evidence that the Transformer is an effective approach to automatic speech recognition as well as a practical substitute for traditional methods such as the RNN.

  • Conference Article
  • Cited by 1
  • 10.1109/asru.2017.8268936
Unsupervised adaptation of student DNNS learned from teacher RNNS for improved ASR performance
  • Dec 1, 2017
  • 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
  • Lahiru Samarakoon + 1 more

In automatic speech recognition (ASR), adaptation techniques are used to minimize the mismatch between training and testing conditions. Many successful techniques have been proposed for deep neural network (DNN) acoustic model (AM) adaptation. Recently, recurrent neural networks (RNNs) have outperformed DNNs in ASR tasks. However, the adaptation of RNN AMs is challenging and in some cases when combined with adaptation, DNN AMs outperform adapted RNN AMs. In this paper, we combine student-teacher training and unsupervised adaptation to improve ASR performance. First, RNNs are used as teachers to train student DNNs. Then, these student DNNs are adapted in an unsupervised fashion. Experimental results on the AMI IHM and AMI SDM tasks show that student DNNs are adaptable with significant performance improvements for both frame-wise and sequentially trained systems. We also show that the combination of adapted DNNs with teacher RNNs can further improve the performance.
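
A minimal sketch of the frame-wise student-teacher objective described above, where a student DNN is trained to match a teacher RNN's per-frame posteriors on unlabeled data; the KL-divergence form and the tiny layer sizes are illustrative choices, not necessarily the paper's exact recipe.

```python
# Hedged sketch of frame-wise student-teacher training: the student DNN matches
# the teacher RNN's per-frame posteriors, so no transcriptions are needed. The
# KL objective and the tiny layer sizes are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

n_feats, n_senones = 40, 500
teacher = nn.LSTM(n_feats, n_senones, batch_first=True)     # stand-in for the trained teacher RNN
student = nn.Sequential(nn.Linear(n_feats, 256), nn.ReLU(), nn.Linear(256, n_senones))

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
frames = torch.randn(8, 50, n_feats)                        # unlabeled adaptation data (placeholder)

for step in range(10):
    with torch.no_grad():
        teacher_logits, _ = teacher(frames)                 # (batch, time, n_senones)
    log_p_student = F.log_softmax(student(frames), dim=-1)
    p_teacher = F.softmax(teacher_logits, dim=-1)
    loss = F.kl_div(log_p_student, p_teacher, reduction="batchmean")   # frame-wise KL(teacher || student)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"student-teacher KL after {step + 1} steps: {loss.item():.3f}")
```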

  • Book Chapter
  • Cited by 25
  • 10.1007/978-3-030-69538-5_31
TinyGAN: Distilling BigGAN for Conditional Image Generation
  • Jan 1, 2021
  • Ting-Yun Chang + 1 more

Generative Adversarial Networks (GANs) have become a powerful approach for generative image modeling. However, GANs are notorious for their training instability, especially on large-scale, complex datasets. While the recent work of BigGAN has significantly improved the quality of image generation on ImageNet, it requires a huge model, making it hard to deploy on resource-constrained devices. To reduce the model size, we propose a black-box knowledge distillation framework for compressing GANs, which highlights a stable and efficient training process. Given BigGAN as the teacher network, we manage to train a much smaller student network to mimic its functionality, achieving competitive performance on Inception and FID scores with the generator having 16× fewer parameters. (The source code and the trained model are publicly available at https://github.com/terarachang/ACCV_TinyGAN).
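
A minimal sketch of black-box GAN distillation in the spirit of the abstract: the student generator is trained to reproduce the teacher's output for the same latent code. Only a pixel-level term is shown, and the tiny fully connected "teacher" is a stand-in for BigGAN; the full TinyGAN objective combines several loss terms.

```python
# Hedged sketch of black-box GAN distillation: the student generator mimics the
# teacher's image for the same latent code. Networks, sizes, and the "teacher"
# stand-in are placeholders; only a pixel-level term is shown.
import torch
import torch.nn as nn

z_dim, img_dim = 16, 3 * 32 * 32

teacher = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, img_dim))  # frozen stand-in
student = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, img_dim))    # much smaller
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for step in range(100):
    z = torch.randn(32, z_dim)                        # query the teacher as a black box
    with torch.no_grad():
        target = teacher(z)
    loss = nn.functional.l1_loss(student(z), target)  # pixel-level distillation term only
    opt.zero_grad()
    loss.backward()
    opt.step()
print(f"pixel distillation loss: {loss.item():.3f}")
```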

  • Conference Article
  • Cited by 2
  • 10.1109/iciscae48440.2019.221696
An Exploration of Recurrent Units for Automatic Speech Recognition with RNN based Acoustic Model
  • Sep 1, 2019
  • Huayang Zhang

Recurrent neural network (RNN) has become a popular technology for automatic speech recognition (ASR). However, the vanilla RNN is difficult to train due to the problem of vanishing gradient and thus has poor performance. Some units with gate mechanism have been proposed to solve the problem, such as gated recurrent unit (GRU), long short-term memory (LSTM), projected LSTM (LSTMP), projected GRU (PGRU) and output-gated PGRU (OPGRU). In this work, we aim to evaluate the performance of above RNN units for acoustic modeling in a Mandarin ASR task. We evaluate three conditions, including unidirectional RNN, bidirectional RNN (BRNN) and time delay neural network (TDNN) – RNN. The experiments were done on Aishell-1 corpus by using Kaldi toolkit. The results show that PGRU gets the best performance on all three conditions and its model size is also smaller than that of LSTM and LSTMP.

  • Research Article
  • Cited by 1
  • 10.11648/j.jccee.20251002.12
A Review on Aerospace-AI, with Ethics and Implications
  • Mar 11, 2025
  • Journal of Civil, Construction and Environmental Engineering
  • Derrick Mirindi + 3 more

The rapid advancement of aerospace technology, coupled with the exponential growth in available data, has catalyzed the integration of artificial intelligence (AI) across the aerospace sector. This comprehensive review examines the state-of-the-art applications of AI, machine learning (ML), deep learning (DL), and generative artificial intelligence (GenAI) in aerospace. Our analysis reveals that ML algorithms demonstrate remarkable capabilities: the random forest (RF) algorithm achieves precision within 10 meters for trajectory prediction, while support vector machine (SVM) algorithms show 99.89% accuracy in aircraft fault detection. Decision tree (DT) algorithms excel in aircraft system diagnostics with adaptive learning capabilities. In the realm of deep learning, convolutional neural network (CNN) algorithms achieve 79% accuracy in satellite component detection and structural inspection, while recurrent neural network (RNN) algorithms and Long Short-Term Memory (LSTM) networks demonstrate superior performance in 4D trajectory prediction and engine health monitoring. GenAI, particularly through generative adversarial networks (GANs), has revolutionized airfoil design optimization, achieving less than 1% error in profile fitting and 10% error in aerodynamic stealth characteristics. However, these algorithms face scalability challenges when processing large-scale datasets in real-time applications, particularly in mission-critical scenarios. Our research also identifies four ethical considerations: bias prevention in automated systems, transparency in decision-making processes, privacy protection in data handling, and the implementation of important safety protocols. This study provides a foundation for understanding the current landscape of aerospace-AI integration while highlighting the importance of addressing ethical implications in future developments. The successful implementation of these technologies will require continuous innovation in validation methodologies, the establishment of universal ethical standards, and enhanced community engagement through citizen science initiatives that involve stakeholders.

  • Single Book
  • Cited by 80
  • 10.1007/3-540-45720-8
Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence
  • Jan 1, 2001
  • José Mira + 1 more

  • Research Article
  • Cited by 3
  • 10.32604/csse.2022.024214
Enhanced Marathi Speech Recognition Facilitated by Grasshopper Optimisation-Based Recurrent Neural Network
  • Jan 1, 2022
  • Computer Systems Science and Engineering
  • Ravindra Parshuram Bachate + 5 more

Communication is a significant part of being human and living in the world. Languages come in diverse kinds and variations; a person who speaks a language may still be unable to communicate effectively with someone who speaks it in a different accent. Numerous application fields, such as education, mobility, smart systems, security, and health care, make extensive use of speech or voice recognition models. However, most studies focus on Arabic, Asian, or English languages while ignoring other significant languages such as Marathi, which motivates broader research on regional languages. Within the speech recognition field, the main stages are feature extraction and classification. This paper emphasizes developing a speech recognition model for the Marathi language by optimizing a Recurrent Neural Network (RNN). Here, preprocessing of the input signal is performed by smoothing and median filtering. After preprocessing, feature extraction is carried out using MFCC and spectral features to obtain precise features from the input Marathi speech corpus. The optimized RNN classifier is then used for speech recognition, where the number of hidden neurons in the RNN is optimized by the Grasshopper Optimization Algorithm (GOA). Finally, comparison with conventional techniques shows that the proposed model outperforms most competing models on a benchmark dataset.
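
A small sketch of the front end described above, assuming librosa and SciPy: median-filter preprocessing followed by MFCC and spectral-feature extraction. The synthetic signal, sampling rate, and feature settings are placeholders, and the GOA search over hidden neurons is not implemented here.

```python
# Hedged sketch of the preprocessing and feature-extraction front end: median
# filtering followed by MFCC and one spectral feature. The signal is a random
# stand-in; in practice librosa.load would read the Marathi corpus files.
import numpy as np
import librosa
from scipy.signal import medfilt

sr = 16000
y = 0.1 * np.random.randn(sr).astype(np.float32)           # stand-in for one second of speech
y = medfilt(y, kernel_size=5)                               # median-filter preprocessing
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)          # (13, n_frames)
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)    # (1, n_frames), one example spectral feature

features = np.vstack([mfcc, centroid]).T                    # (n_frames, 14) frame-level feature matrix
print(features.shape)
```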

  • Research Article
  • Cited by 2
  • 10.12928/biste.v5i4.9668
Comparative Analysis of MLP, CNN, and RNN Models in Automatic Speech Recognition: Dissecting Performance Metric
  • Jan 8, 2024
  • Buletin Ilmiah Sarjana Teknik Elektro
  • Abraham K S Lenson + 1 more

This study conducts a comparative analysis of three prominent machine learning models: Multi-Layer Perceptrons (MLP), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN) with Long Short-Term Memory (LSTM) in the field of automatic speech recognition (ASR). This research is distinct in its use of the LibriSpeech 'test-clean' dataset, selected for its diversity in speaker accents and varied recording conditions, establishing it as a robust benchmark for ASR performance evaluation. Our approach involved preprocessing the audio data to ensure consistency and extracting Mel-Frequency Cepstral Coefficients (MFCCs) as the primary features, crucial for capturing the nuances of human speech. The models were meticulously configured with specific architectural details and hyperparameters. The MLP and CNN models were designed to maximize their pattern recognition capabilities, while the RNN (LSTM) was optimized for processing temporal data. To assess their performance, we employed metrics such as precision, recall, and F1-score. The MLP and CNN models demonstrated exceptional accuracy, with scores of 0.98 across these metrics, indicating their effectiveness in feature extraction and pattern recognition. In contrast, the LSTM variant of RNN showed lower efficacy, with scores below 0.60, highlighting the challenges in handling sequential speech data. The results of this study shed light on the differing capabilities of these models in ASR. While the high accuracy of MLP and CNN suggests potential overfitting, the underperformance of LSTM underscores the necessity for further refinement in sequential data processing. This research contributes to the understanding of various machine learning approaches in ASR and paves the way for future investigations. We propose exploring hybrid model architectures and enhancing feature extraction methods to develop more sophisticated, real-world ASR systems. Additionally, our findings underscore the importance of considering model-specific strengths and limitations in ASR applications, guiding the direction of future research in this rapidly evolving field.
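
A minimal sketch of the evaluation step described above, comparing models by macro-averaged precision, recall, and F1 with scikit-learn; the predictions are randomly generated stand-ins rather than outputs of the study's MLP, CNN, or LSTM models.

```python
# Hedged sketch of the metric comparison: precision, recall, and F1 per model.
# The labels and predictions are random stand-ins, not results from the study.
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=500)                      # 10 toy classes

models = {
    "MLP": np.where(rng.random(500) < 0.9, y_true, rng.integers(0, 10, size=500)),
    "LSTM": np.where(rng.random(500) < 0.6, y_true, rng.integers(0, 10, size=500)),
}
for name, y_pred in models.items():
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro", zero_division=0)
    print(f"{name}: precision={p:.2f} recall={r:.2f} F1={f1:.2f}")
```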

  • Research Article
  • Cited by 244
  • 10.1023/b:warm.0000024727.94701.12
River Flow Forecasting using Recurrent Neural Networks
  • Apr 1, 2004
  • Water Resources Management
  • D Nagesh Kumar + 2 more

Forecasting a hydrologic time series has been one of the most complicated tasks owing to the wide range of the data, the uncertainties in the parameters influencing the time series, and the non-availability of adequate data. Recently, Artificial Neural Networks (ANNs) have become quite popular for time series forecasting in various fields. This paper demonstrates the use of ANNs to forecast monthly river flows. Two different networks, namely the feed-forward network and the recurrent neural network, have been chosen. The feed-forward network is trained using the conventional backpropagation algorithm with many improvements, and the recurrent neural network is trained using the method of ordered partial derivatives. The selection of architecture and the training procedure for both networks are presented. The selected ANN models were used to train on and forecast the monthly flows of a river in India, with a catchment area of 5189 km² up to the gauging site. The trained networks are used for both single-step-ahead and multiple-step-ahead forecasting. A comparative study of both networks indicates that the recurrent neural networks performed better than the feed-forward networks. In addition, the size of the architecture and the training time required were smaller for the recurrent neural networks. The recurrent neural network gave better results for both single-step-ahead and multiple-step-ahead forecasting. Hence, recurrent neural networks are recommended as a tool for river flow forecasting.
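
A small PyTorch sketch of the single-step versus multiple-step-ahead forecasting compared above: a simple recurrent model is trained one step ahead and then rolled forward recursively; the seasonal "monthly flow" series is synthetic, not the Indian catchment data used in the paper.

```python
# Hedged sketch: train an RNN one step ahead on a synthetic seasonal series,
# then roll it forward recursively for a multiple-step-ahead forecast.
import math
import torch
import torch.nn as nn

months = torch.arange(240, dtype=torch.float32)
flow = 1 + 0.5 * torch.sin(2 * math.pi * months / 12) + 0.05 * torch.randn(240)   # synthetic "monthly flows"

window = 12
X = torch.stack([flow[i:i + window] for i in range(len(flow) - window)]).unsqueeze(-1)
y = flow[window:].unsqueeze(-1)

class FlowRNN(nn.Module):
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.rnn = nn.RNN(1, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)
    def forward(self, x):
        h, _ = self.rnn(x)
        return self.out(h[:, -1])              # one-step-ahead prediction

model = FlowRNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for epoch in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(X), y)
    loss.backward()
    opt.step()

# Multiple-step-ahead: feed each prediction back in as the next input.
history = flow[-window:].clone()
forecasts = []
with torch.no_grad():
    for _ in range(6):                          # forecast six months ahead
        nxt = model(history.reshape(1, window, 1)).item()
        forecasts.append(nxt)
        history = torch.cat([history[1:], torch.tensor([nxt])])
print([round(f, 3) for f in forecasts])
```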
