Evolving Neural Network Designs with Genetic Algorithms: Applications in Image Classification, NLP, and Reinforcement Learning
A method of evolving deep learning architectures using genetic algorithms is presented. The method is a first step towards a low-cost evolutionary search for task-specific neural networks. We evolve task-specific model architectures optimized for fast execution and low error on several standard machine learning tasks: image classification, character-level language modeling, and solving the cart pole problem. We also introduce a simple variation of the method that is capable of evolving neural networks with recurrent connections of varying depth and length and show performance on a word-level language modeling task. The method is implemented in an open-source library. We hope that the ability to run an evolutionary search at this scale will make it possible for a wide audience to develop deep learning architectures that are specialized for a variety of tasks and to develop many interesting novel architectural features. A new method that uses evolutionary search to directly modify existing neural network architectures to perform a specific task is presented. We demonstrate that task-specific specialization of deep learning models can be useful in practice. We modify convolutional neural networks, residual networks, and an LSTM variant to perform various tasks, and show that specialized networks often perform better than models trained from scratch that have many more parameters and much larger training time. For example, on the object recognition task, a specialized model is built by training a base network to predict object position and then applying a series of genetic search operations to squeeze the network and fit new final layer weights to the output. The specialized model is 8 times faster and has 13% lower error, despite being 17 times smaller than a fully trained larger and slower network.
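As a rough illustration of the kind of evolutionary search this abstract describes, the sketch below evolves a list of hidden-layer widths using mutation and truncation selection with elitism. The genome encoding, the `WIDTHS` choices, the `fitness` proxy (a stand-in for actually training a network and measuring error and execution cost), and all parameter values are assumptions made for illustration; none of them come from the paper or its open-source library.

```python
import random

WIDTHS = (8, 16, 32, 64)  # hypothetical layer-width choices
MAX_LAYERS = 6            # hypothetical cap on genome length

def random_genome(rng, max_layers=4):
    # A genome is a list of hidden-layer widths (assumed encoding).
    return [rng.choice(WIDTHS) for _ in range(rng.randint(1, max_layers))]

def mutate(genome, rng):
    # Randomly add, remove, or resize one layer; fall back to resize
    # when adding or removing would violate the length limits.
    g = list(genome)
    op = rng.choice(["add", "remove", "resize"])
    if op == "add" and len(g) < MAX_LAYERS:
        g.insert(rng.randrange(len(g) + 1), rng.choice(WIDTHS))
    elif op == "remove" and len(g) > 1:
        g.pop(rng.randrange(len(g)))
    else:
        g[rng.randrange(len(g))] = rng.choice(WIDTHS)
    return g

def fitness(genome):
    # Placeholder objective (assumption): a real search would train the
    # network and combine its validation error with its execution time.
    error_proxy = 1.0 / (1.0 + sum(genome))         # more units, lower "error"
    cost_proxy = 0.001 * sum(genome) * len(genome)  # more units, higher "cost"
    return -(error_proxy + cost_proxy)

def evolve(generations=20, pop_size=12, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the better half unchanged (elitism)
        # and refill the population with mutated copies of survivors.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

Because the surviving half is carried over unchanged, the best fitness in the population never decreases from one generation to the next. In a real search the expensive step is the fitness evaluation, which is why a low-cost evaluation matters.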
- Research Article
25
- 10.1121/1.4768800
- Jan 1, 2013
- The Journal of the Acoustical Society of America
Mandarin Chinese is based on characters which are syllabic in nature and morphological in meaning. All spoken languages have syllabiotactic rules which govern the construction of syllables and their allowed sequences. These constraints are not as restrictive as those learned from word sequences, but they can provide additional useful linguistic information. Hence, it is possible to improve speech recognition performance by appropriately combining these two types of constraints. For the Chinese language considered in this paper, character level language models (LMs) can be used as a first level approximation to allowed syllable sequences. To test this idea, word and character level n-gram LMs were trained on 2.8 billion words (equivalent to 4.3 billion characters) of texts from a wide collection of text sources. Both hypothesis and model based combination techniques were investigated to combine word and character level LMs. Significant character error rate reductions of up to 7.3% relative were obtained on a state-of-the-art Mandarin Chinese broadcast audio recognition task using an adapted history dependent multi-level LM that performs a log-linear combination of character and word level LMs. This supports the hypothesis that character or syllable sequence models are useful for improving Mandarin speech recognition performance.
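The log-linear combination mentioned in this abstract can be illustrated with a generic sketch: the two models' log-probabilities for each candidate are mixed with a weight and then renormalized. The function name, the weight `lam`, and the toy probabilities are hypothetical, and this is plain log-linear interpolation rather than the adapted, history dependent multi-level LM the paper describes.

```python
import math

def log_linear_combine(p_word, p_char, lam=0.7):
    """Combine two distributions over the same candidates log-linearly.

    p_word, p_char: dicts mapping candidate -> probability (> 0) under the
    word-level and character-level LM. lam is an assumed weight on the
    word-level model.
    """
    # Weighted sum of log-probabilities for each candidate.
    scores = {c: lam * math.log(p_word[c]) + (1.0 - lam) * math.log(p_char[c])
              for c in p_word}
    # Renormalize so the combined scores form a probability distribution.
    log_z = math.log(sum(math.exp(s) for s in scores.values()))
    return {c: math.exp(s - log_z) for c, s in scores.items()}

if __name__ == "__main__":
    # Toy candidate set with made-up probabilities.
    word_lm = {"A": 0.6, "B": 0.3, "C": 0.1}
    char_lm = {"A": 0.2, "B": 0.5, "C": 0.3}
    print(log_linear_combine(word_lm, char_lm))
```

Setting `lam=1.0` recovers the word-level distribution and `lam=0.0` the character-level one. In practice such a weight would be tuned on held-out data.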
- Research Article
93
- 10.1109/3477.836376
- Apr 1, 2000
- IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics)
This paper proposes a TD (temporal difference) and GA (genetic algorithm)-based reinforcement (TDGAR) learning method and applies it to the control of a real magnetic bearing system. The TDGAR learning scheme is a new hybrid GA, which integrates the TD prediction method and the GA to perform the reinforcement learning task. The TDGAR learning system is composed of two integrated feedforward networks. One neural network acts as a critic network to guide the learning of the other network (the action network) which determines the outputs (actions) of the TDGAR learning system. The action network can be a normal neural network or a neural fuzzy network. Using the TD prediction method, the critic network can predict the external reinforcement signal and provide a more informative internal reinforcement signal to the action network. The action network uses the GA to adapt itself according to the internal reinforcement signal. The key concept of the TDGAR learning scheme is to formulate the internal reinforcement signal as the fitness function for the GA such that the GA can evaluate the candidate solutions (chromosomes) regularly, even during periods without external feedback from the environment. This enables the GA to proceed to new generations regularly without waiting for the arrival of the external reinforcement signal. This can usually accelerate the GA learning since a reinforcement signal may only be available at a time long after a sequence of actions has occurred in the reinforcement learning problem. The proposed TDGAR learning system has been used to control an active magnetic bearing (AMB) system in practice. A systematic design procedure is developed to achieve successful integration of all the subsystems including magnetic suspension, mechanical structure, and controller training. The results show that the TDGAR learning scheme can successfully find a neural controller or a neural fuzzy controller for a self-designed magnetic bearing system.
- Single Book
80
- 10.1007/3-540-45720-8
- Jan 1, 2001
Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence
- Conference Article
- 10.1109/iccp53602.2021.9733482
- Oct 28, 2021
The detection of gravitational waves is considered to be one of the most magnificent discoveries of the century. Due to the high computational cost of the matched filtering pipeline, there is a hunt for an alternative powerful system. I present the use of a 1D residual neural network for the detection of gravitational waves. Residual networks have transformed many fields like image classification, face recognition and object detection with their robust structure. With the increase in sensitivity of the LIGO detectors, we expect many more sources of gravitational waves in the universe to be detected. However, deep learning networks are trained only once. When used for a classification task, deep neural networks are trained to predict only a fixed number of classes. Therefore, when a new type of gravitational wave is to be detected, this turns out to be a drawback of deep learning. Shallow neural networks can be used to learn data with simple patterns but fail to give good results as the complexity of the data increases. Remodelling the neural network with the detection of each new type of GW is highly infeasible. In this letter, I also discuss ways to reduce the time required to adapt to such changes in the detection of gravitational waves for deep learning methods. Primarily, I aim to create a custom residual neural network for 1-dimensional time series inputs, which can learn a large number of features from the dataset without giving up on increasing the number of classes or increasing the complexity of the data. I use two of the classes of binary coalescence signals (Binary Black Hole Merger and Binary Neutron Star Merger signals) detected by LIGO to check the performance of the residual structure on gravitational wave detection.
- Research Article
129
- 10.1109/tip.2016.2567069
- May 11, 2016
- IEEE Transactions on Image Processing
Inspired by the popular deep learning architecture, deep stacking network (DSN), a specific deep model for polarimetric synthetic aperture radar (POLSAR) image classification is proposed in this paper, which is named Wishart DSN (W-DSN). First of all, a fast implementation of Wishart distance is achieved by a special linear transformation, which speeds up the classification of POLSAR images and makes it possible to use this polarimetric information in the following neural network (NN). Then, a single-hidden-layer NN based on the fast Wishart distance is defined for POLSAR image classification, which is named Wishart network (WN) and improves the classification accuracy. Finally, a multi-layer NN is formed by stacking WNs, which is in fact the proposed deep learning architecture W-DSN for POLSAR image classification and improves the classification accuracy further. In addition, the structure of WN can be expanded in a straightforward way by adding hidden units if necessary, as can the structure of the W-DSN. As a preliminary exploration on formulating a specific deep learning architecture for POLSAR image classification, the proposed methods may establish a simple but clever connection between POLSAR image interpretation and deep learning. The experimental results on a real POLSAR image show that the fast implementation of Wishart distance is very efficient (a POLSAR image with 768 000 pixels can be classified in 0.53 s), and both the single-hidden-layer architecture WN and the deep learning architecture W-DSN for POLSAR image classification perform well and work efficiently.
- Research Article
1
- 10.20965/jaciii.1999.p0439
- Dec 20, 1999
- Journal of Advanced Computational Intelligence and Intelligent Informatics
Learning has long been and will continue to be a key issue in intelligent algorithms and systems design. Emulating the behavior and mechanisms of human learning by machines at such high levels as symbolic processing and such low levels as neuronal processing has long been a dominant interest among researchers worldwide. Neural networks, fuzzy logic, and evolutionary algorithms represent the three most active research areas. With advanced theoretical studies and computer technology, many promising algorithms and systems using these techniques have been designed and implemented for a wide range of applications. This Special Issue presents seven papers on learning in intelligent algorithms and systems design from researchers in Japan, China, Australia, and the U.S. Neural Networks: Emulating low-level human intelligent processing, or neuronal processing, gave birth to artificial neural networks more than five decades ago. It was hoped that devices based on biological neural networks would possess characteristics of the human brain. Neural networks have reattracted researchers' attention since the late 1980s when back-propagation algorithms were used to train multilayer feed-forward neural networks. In the last decades, we have seen promising progress in this research field yield many new models, learning algorithms, and real-world applications, evidenced by the publication of new journals in this field. Fuzzy Logic: Since L. A. Zadeh introduced fuzzy set theory in 1965, fuzzy logic has increasingly become the focus of many researchers and engineers opening up new research and problem solving. Fuzzy set theory has been favorably applied to control system design. In the last few years, fuzzy model applications have bloomed in image processing and pattern recognition.
Evolutionary Algorithms: Evolutionary optimization algorithms have been studied for over three decades, emulating natural evolutionary search and selection so powerful in global optimization. The study of evolutionary algorithms includes evolutionary programming (EP), evolutionary strategies (ESs), genetic algorithms (GAs), and genetic programming (GP). In the last few years, we have also seen multiple computational algorithms combined to maximize system performance, such as neurofuzzy networks, fuzzy neural networks, fuzzy logic and genetic optimization, neural networks, and evolutionary algorithms. This Special Issue also includes papers that introduce combined techniques. Wang et al. present an improved fuzzy algorithm for enhanced eyeground images. Examination of the eyeground image is effective in diagnosing glaucoma and diabetes. Conventional eyeground image quality is usually too poor for doctors to obtain useful information, so enhancement is required to eliminate this. Due to details and uncertainties in eyeground images, conventional enhancement such as histogram equalization, edge enhancement, and high-pass filters fail to achieve good results. Fuzzy enhancement enhances images in three steps: (1) transferring an image from the spatial domain to the fuzzy domain; (2) conducting enhancement in the fuzzy domain; and (3) returning the image from the fuzzy domain to the spatial domain. The paper detailing this proposes improved mapping and fast implementation. Mohammadian presents a method for designing self-learning hierarchical fuzzy logic control systems based on the integration of evolutionary algorithms and fuzzy logic. The purpose of such an approach is to provide an integrated knowledge base for intelligent control and collision avoidance in a multirobot system.
Evolutionary algorithms are used as an adaptive method for learning fuzzy knowledge bases of control systems and learning, mapping, and interaction between fuzzy knowledge bases of different fuzzy logic systems. Fuzzy integral has been found useful in data fusion. Pham and Wagner present an approach based on the fuzzy integral and GAs to combine likelihood values of cohort speakers. The fuzzy integral nonlinearly fuses similarity measures of an utterance assigned to cohort speakers. In their approach, GAs find optimal fuzzy densities required for fuzzy fusion. Experiments using commercial speech corpus T146 show their approach achieves more favorable performance than conventional normalization. Evolution reflects the behavior of a society. Puppala and Sen present a coevolutionary approach to generating behavioral strategies for cooperating agent groups. Agent behavior evolves via GAs, where one genetic algorithm population is evolved per individual in the cooperative group. Groups are evaluated by pairing strategies from each population and best strategy pairs are stored together in shared memory. The approach is evaluated using asymmetric room painting and results demonstrate the superiority of shared memory over random pairing in consistently generating optimal behavior patterns. Object representation and template optimization are two main factors affecting object recognition performance. Lu et al. present an evolutionary algorithm for optimizing handwritten numeral templates represented by rational B-spline surfaces of character foreground-background-distance distribution maps. Initial templates are extracted from training a feed-forward neural network instead of using arbitrarily chosen patterns to reduce iterations required in evolutionary optimization. To further reduce computational complexity, a fast search is used in selection.
Using 1,000 optimized numeral templates, the classifier achieves a classification rate of 96.4% while rejecting 90.7% of nonnumeral patterns when tested on NIST Special Database 3. Determining an appropriate number of clusters is difficult yet important. Li et al. base their approach on rival penalized competitive learning (RPCL), addressing problems of overlapped clusters and dependent components of input vectors by incorporating full covariance matrices into the original RPCL algorithm. The resulting learning algorithm progressively eliminates units whose clusters contain only a small amount of training data. The algorithm is applied to determine the number of clusters in a Gaussian mixture distribution and to optimize the architecture of elliptical function networks for speaker verification and for vowel classification. Another important issue on learning is Kurihara and Sugawara's adaptive reinforcement learning algorithm integrating exploitation- and exploration-oriented learning. This algorithm is more robust in dynamically changing, large-scale environments, providing better performance than either exploitation-oriented learning or exploration-oriented learning, making it well suited for autonomous systems. In closing, we would like to thank the authors who have submitted papers to this Special Issue and express our appreciation to the referees for their excellent work in reading papers under a tight schedule.
- Single Book
39
- 10.1007/bfb0098154
- Jan 1, 1999
Foundations and Tools for Neural Modeling
- Single Book
18
- 10.1007/978-3-642-01216-7
- Jan 1, 2009
The Sixth International Symposium on Neural Networks (ISNN 2009)
- Preprint Article
- 10.5194/egusphere-egu24-19531
- Mar 11, 2024
Gradient-Based Optimisers Versus Genetic Algorithms in Deep Learning Architectures: A Case Study on Rainfall Estimation Over Complex Terrain. Yash Bhisikar (1,*), Nirmal Govindaraj (1,*), Venkatavihan Devaki (2,*), Ritu Anilkumar (3). (1) Birla Institute of Technology And Science, Pilani, K K Birla Goa Campus; (2) Birla Institute of Technology And Science, Pilani, Pilani Campus; (3) North Eastern Space Applications Centre, Department of Space, Umiam. E-mail: f20210483@goa.bits-pilani.ac.in. (*) Authors have contributed equally to this study. Rainfall is a crucial factor that affects planning processes at various scales, ranging from agricultural activities at the village or residence level to governmental initiatives in the domains of water resource management, disaster preparedness, and infrastructural planning. Thus, a reliable estimate of rainfall and a systematic assessment of variations in rainfall patterns is the need of the hour. Recently, several studies have attempted to predict rainfall over various locations using deep learning architectures, including but not limited to artificial neural networks, convolutional neural networks, recurrent neural networks, or a combination of these. However, a major challenge in the estimation of rainfall is the chaotic nature of rainfall, especially the interplay of spatio-temporal components over orographically complex terrain. For complex computer vision challenges, studies have suggested that population search-driven optimisation techniques such as genetic algorithms may be used in the optimisation as an alternative to traditional gradient-based techniques such as Adam, Adadelta and SGD. Through this study, we aim to extend this hypothesis to the case of rainfall estimation. We integrate the use of population search-based techniques, namely genetic algorithms, to optimise a convolutional neural network architecture built using PyTorch.
We have chosen the study area of North-East India for this study as it receives significant monsoon rainfall and is impacted by the undulating terrain that adds complexity to the rainfall estimation. We have used 30 years of rainfall data from the ERA5 Land daily reanalysis dataset with a spatial resolution of 11,132 m for the months of June, July, August and September. Additionally, datasets of the following meteorological variables that can impact rainfall were utilised as input features: dew point temperature, skin temperature, net incoming short-wave radiation received at the surface, wind components and surface pressure. All the datasets are aggregated to daily time steps. Several configurations of the U-Net architecture, such as the number of hidden layers, initialisation techniques and optimisation algorithms, have been used to identify the best configuration in the estimation of rainfall for North-East India. Genetic algorithms were used in initialisation and optimisation to assess the ability of population search heuristics using the PyGAD library. The developed rainfall prediction models were validated at different time steps (0-day, 1-day, 2-day and 3-day latency) on a 7:1:2 train, validation, test dataset split for evaluation metrics such as root mean square error (RMSE) and coefficient of determination (R-squared). The evaluation was performed on a pixel-by-pixel basis as well as an image-by-image basis in order to take magnitude and spatial correlations into consideration. Our study emphasises the importance of considering alternate optimising functions and hyperparameter tuning approaches for complex earth observation challenges such as rainfall prediction.
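The evaluation setup named in this abstract (RMSE, R-squared, and a 7:1:2 train, validation, test split) can be sketched in a few lines of standard-library Python. The function names are hypothetical, and the random shuffling in `split_7_1_2` is an assumption: the abstract does not say whether the split was random or chronological.

```python
import math
import random

def rmse(y_true, y_pred):
    # Root mean square error between observed and predicted values.
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    # Coefficient of determination: 1 - SS_res / SS_tot.
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def split_7_1_2(n_samples, seed=0):
    # Shuffle sample indices (assumed) and cut them 70% / 10% / 20%.
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = n_samples * 7 // 10
    n_val = n_samples // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

if __name__ == "__main__":
    train, val, test = split_7_1_2(100)
    print(len(train), len(val), len(test))
    print(rmse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))
    print(r_squared([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]))
```

For pixel-by-pixel evaluation the two metrics would be applied to flattened rainfall grids; for image-by-image evaluation they would be computed per image and then aggregated.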
- Conference Article
5
- 10.1109/ivcnz51579.2020.9290492
- Nov 25, 2020
Convolutional neural networks (CNNs) have achieved great success in the image classification field in recent years. Usually, human experts are needed to design the architectures of CNNs for different tasks. Evolutionary neural network architecture search could find optimal CNN architectures automatically. However, the previous representations of CNN architectures with evolutionary algorithms have many restrictions. In this paper, we propose a new flexible representation based on the directed acyclic graph to encode CNN architectures, to develop a genetic algorithm (GA) based evolutionary neural network architecture, where the depth of candidate CNNs could be variable. Furthermore, we design new crossover and mutation operators, which can be performed on individuals of different lengths. The proposed algorithm is evaluated on five widely used datasets. The experimental results show that the proposed algorithm achieves very competitive performance against its peer competitors in terms of the classification accuracy and number of parameters.
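One way to picture a crossover that works on individuals of different lengths is the sketch below. It encodes an architecture as a list of nodes in which each node may only read from earlier nodes, so the graph is acyclic by construction, and it repairs input indices after splicing two parents. The encoding, the operation names, and the repair rule are assumptions for illustration; they are not the representation or operators proposed in the paper.

```python
import random

OPS = ["conv3x3", "conv5x5", "maxpool", "identity"]  # hypothetical operations

def random_dag(rng, n_nodes):
    # Node 0 reads the input image; node i reads from 1 or 2 earlier nodes.
    dag = [{"op": rng.choice(OPS), "inputs": []}]
    for i in range(1, n_nodes):
        k = rng.randint(1, min(2, i))
        dag.append({"op": rng.choice(OPS),
                    "inputs": sorted(rng.sample(range(i), k))})
    return dag

def crossover(a, b, rng):
    # Independent cut points let parents of different depths be combined.
    cut_a = rng.randint(1, len(a))
    cut_b = rng.randint(0, len(b))
    child = [{"op": n["op"], "inputs": list(n["inputs"])} for n in a[:cut_a]]
    for node in b[cut_b:]:
        i = len(child)
        # Repair step: clamp each input index so it points at an earlier
        # node of the child; a node with no inputs reads its predecessor.
        inputs = sorted({min(j, i - 1) for j in node["inputs"]}) or [i - 1]
        child.append({"op": node["op"], "inputs": inputs})
    return child

def is_valid(dag):
    # Acyclic if every input index refers to a strictly earlier node.
    return all(0 <= j < i for i, n in enumerate(dag) for j in n["inputs"])

if __name__ == "__main__":
    rng = random.Random(1)
    parent_a, parent_b = random_dag(rng, 3), random_dag(rng, 7)
    child = crossover(parent_a, parent_b, rng)
    print(len(child), is_valid(child))
```

The child's depth ranges from 1 to the sum of the parents' depths, so depth is itself subject to evolution rather than fixed in advance.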
- Conference Article
8
- 10.1145/3372938.3372945
- Oct 23, 2019
One of the important functions of remote sensing data is producing land-use/land-cover maps. Image classification is one of the important applications of remote sensing imagery. Machine learning (ML) techniques have been the most widely used for this purpose in recent years. With the advent of computer vision, and thus the need to deal with large amounts of data while avoiding data redundancy, deep learning techniques appeared. Deep learning (DL) is a branch of machine learning that imitates the structure of the human brain and depends on artificial neural networks (ANNs). Optimization of neural networks is necessary to reduce the loss function and avoid redundant data in the training set, and thus raise the accuracy. Genetic algorithms (GAs) are the most widely used in neural network optimization, where the networks are considered as fully connected neural networks. Convolutional neural networks (CNNs) are a branch of artificial neural networks that save computing cost and processing time. Thus, this paper presents a review of deep learning algorithms, especially artificial neural networks, the genetic algorithm, and convolutional neural networks. This paper also introduces a comparative study between the genetic algorithm and the convolutional neural network method. This comparison is based on the overall accuracy (OA) and the kappa coefficient. This comparison shows that there are many conditions that can affect classifier accuracy. The results demonstrate that the CNN algorithms are more accurate than the GA and, on the other hand, that the CNN algorithms have lower computing cost.
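The two comparison metrics named in this abstract, overall accuracy (OA) and the kappa coefficient, are both computed from a confusion matrix. A minimal sketch follows, using the standard definition of Cohen's kappa; the function names and the example matrix are made up for illustration and do not come from the paper.

```python
def overall_accuracy(cm):
    # Fraction of samples on the diagonal of the confusion matrix.
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    # kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    # and p_e is the agreement expected by chance from the marginals.
    n = sum(sum(row) for row in cm)
    p_o = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(len(cm))) for j in range(len(cm))]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)

if __name__ == "__main__":
    # Rows: reference classes; columns: predicted classes (toy numbers).
    confusion = [[45, 5],
                 [10, 40]]
    print(overall_accuracy(confusion), cohen_kappa(confusion))
```

Kappa discounts agreement that would occur by chance, which is why it is often reported alongside OA for land-cover maps with imbalanced classes.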
- Single Book
33
- 10.1007/3-540-44869-1
- Jan 1, 2003
Artificial Neural Nets Problem Solving Methods
- Conference Article
1
- 10.1109/iccae56788.2023.10111236
- Mar 3, 2023
Lately, evolutionary algorithms have gained traction due to their ability to produce state-of-the-art deep learning architectures for a given data set. Even though they require a considerable amount of compute resources, they are a heavily researched domain because of the complexities involved in designing deep learning architectures. Currently, none of the evolutionary approaches available have incorporated the attention mechanism, which is a proven technique to improve the performance of image classification and language models. This paper posits a neuroevolutionary technique coupled with the use of Convolution Block Attention Module for image classification. As technology progresses, it’s inevitable that there will be massive advancements leading to cheaper and more available computing, making evolutionary approaches a promising avenue to develop task-specific deep learning models. The proposed approach evolves a topology that achieves a high fitness of 87.44%, using fewer parameters as compared to previous approaches. This results in a superior fitness score compared to most past approaches, despite being evolved for just a few generations.
- Conference Article
164
- 10.3115/v1/n15-1038
- Jan 1, 2015
We present an approach to speech recognition that uses only a neural network to map acoustic input to characters, a character-level language model, and a beam search decoding procedure. This approach eliminates much of the complex infrastructure of modern speech recognition systems, making it possible to directly train a speech recognizer using errors generated by spoken language understanding tasks. The system naturally handles out of vocabulary words and spoken word fragments. We demonstrate our approach using the challenging Switchboard telephone conversation transcription task, achieving a word error rate competitive with existing baseline systems. To our knowledge, this is the first entirely neural-network-based system to achieve strong speech transcription results on a conversational speech task. We analyze qualitative differences between transcriptions produced by our lexicon-free approach and transcriptions produced by a standard speech recognition system. Finally, we evaluate the impact of large context neural network character language models as compared to standard n-gram models within our framework.
- Conference Article
82
- 10.1109/sips.2016.48
- Oct 1, 2016
In this paper, a neural network based real-time speech recognition (SR) system is developed using an FPGA for very low-power operation. The implemented system employs two recurrent neural networks (RNNs): one is a speech-to-character RNN for acoustic modeling (AM) and the other is for character-level language modeling (LM). The system also employs a statistical word-level LM to improve the recognition accuracy. The results of the AM, the character-level LM, and the word-level LM are combined using a fairly simple N-best search algorithm instead of the hidden Markov model (HMM) based network. The RNNs are implemented using massively parallel processing elements (PEs) for low latency and high throughput. The weights are quantized to 6 bits to store all of them in the on-chip memory of an FPGA. The proposed algorithm is implemented on a Xilinx XC7Z045, and the system can operate much faster than real-time.
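To make the 6-bit weight storage concrete, here is a minimal sketch of symmetric uniform quantization, in which each weight is mapped to a signed integer code and a single shared scale. The abstract does not describe the quantization scheme actually used, so the symmetric range, the per-tensor scale, and the function names are all assumptions.

```python
def quantize(weights, bits=6):
    # Symmetric signed range: for 6 bits the codes lie in [-31, 31] (assumed).
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    # Recover approximate weights from integer codes and the shared scale.
    return [c * scale for c in codes]

if __name__ == "__main__":
    w = [0.5, -1.0, 0.25, 0.0]
    codes, scale = quantize(w)
    print(codes, dequantize(codes, scale))
```

With this scheme the reconstruction error of each weight is at most half a quantization step. Storing 6-bit codes instead of 32-bit floats cuts weight memory by roughly a factor of five, which is the kind of saving that makes on-chip storage feasible.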