Predicting transient thermo-hydraulic parameters of fast reactor cores using two-layer decomposition hybrid neural network
- Research Article
- 10.22067/jam.v5i2.28513
- Jun 12, 2014
- SHILAP Revista de lepidopterología
Introduction: Grading agricultural products is important for supplying both domestic and overseas markets. Grading yields more profitable product ranges and greater customer satisfaction. It is carried out based on parameters such as color, ripeness level, dimensions and weight. Product weight is one of the most effective parameters in grading. Egg weight is directly related to egg size, and in egg grading, size is very important for marketing. This research aimed to design, fabricate and evaluate an egg weighing system based on the dielectric properties of eggs. Materials and Methods: The work was divided into several stages: design and construction of the hardware, writing the data-acquisition software, conducting nondestructive tests and collecting data, analyzing the data using artificial intelligence, and embedding the results of the analysis in the software code to calibrate the device. Larger eggs, as dielectric substances, cause a greater increase in the capacitance of the capacitive sensor. Therefore, by deriving a relation between the capacitance of the sensor and egg weight, one can predict the weight of a sample. A prototype weighing unit was designed and fabricated. It was composed of a chassis, a voltage source, a sinusoidal signal generator, a voltage measurement unit, an AVR microcontroller, a COM port, a capacitive sensor, an LCD and a keyboard. A neural network was used for egg weight prediction. The network receives 16 voltage values measured at different frequencies as inputs, and its output is the egg weight. To calibrate and evaluate the weighing unit, 150 fresh egg samples were obtained on laying day from a local poultry farm. Experiments were divided into three groups. 
The experiments were carried out on laying day and on the second and fourth days after laying. Results and Discussion: In this study, two series of networks were built and evaluated: two-layer networks in the first series and three-layer networks in the second. In the two-layer networks, the number of neurons in the hidden layer was varied from 2 to 10. Among them, the network with 10 neurons gave the best results (the highest R-value and the minimum RMSE) and can be chosen as the most effective two-layer network. The three-layer networks had two hidden layers. The number of neurons in the first hidden layer was 10, and in the second it was varied from 1 to 20. Among the three-layer networks, the one with 7 neurons in the second hidden layer, which had the highest R-value and the lowest error, was the most appropriate. It was even more efficient than the two-layer network with 10 neurons. So the most appropriate structure is 16-10-7-1 (inputs, first hidden layer, second hidden layer, output), and it was selected for calibration of the weighing device. To assess the accuracy of the weighing machine, the weights of 24 fresh eggs were predicted and compared with the actual values obtained using a digital scale with an accuracy of 0.01 g. The paired t-test was used to compare the measured and predicted values, and the Bland-Altman method was used to chart the agreement between them. The difference between the measured and predicted values reached up to 5.4 g, for a very large sample. The mean absolute error was 2.21 g and the mean absolute percentage error was 3.75%. The 95% limits of agreement between the two weighing methods were -5.3 g and 3.36 g. 
Thus, the dielectric technique may underestimate the egg weight by up to 5.3 g or overestimate it by up to 3.36 g relative to the actual value. Conclusions: The best results were obtained with a three-layer network having 10 and 7 neurons in the first and second hidden layers, respectively, with the highest R-value (0.983) and the lowest error (0.502). Therefore, this network was applied for egg weight prediction. To evaluate the device, the weights of 24 fresh eggs were estimated using the device and compared with actual values; the maximum error observed was 5.4 g.
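The agreement statistics quoted in this abstract (mean absolute error, mean absolute percentage error, and Bland-Altman 95% limits of agreement) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `agreement_stats`, the sign convention (predicted minus measured) and the two sample weights are assumptions made for the example.

```python
import numpy as np

def agreement_stats(measured, predicted):
    """Return MAE, MAPE (%), and Bland-Altman 95% limits of agreement."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    diff = predicted - measured              # assumed sign convention
    mae = np.mean(np.abs(diff))
    mape = 100.0 * np.mean(np.abs(diff) / measured)
    bias = np.mean(diff)                     # mean difference between methods
    sd = np.std(diff, ddof=1)                # sample standard deviation
    return mae, mape, (bias - 1.96 * sd, bias + 1.96 * sd)

# Made-up weights in grams, for illustration only
mae, mape, limits = agreement_stats([60.0, 50.0], [57.0, 52.0])
```

With this sign convention, a negative lower limit corresponds to underestimation by the sensor and a positive upper limit to overestimation.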
- Supplementary Content
7
- 10.1184/r1/8336783.v1
- Jun 27, 2019
- Figshare
Machine learning has become an important tool set for artificial intelligence and data science across many fields. A modern machine learning method can often be reduced to a mathematical optimization problem. Among algorithms for solving such problems, gradient descent and its variants, like stochastic gradient descent and momentum methods, are the most popular. The optimization problem induced by classical machine learning methods is often convex and smooth, and gradient descent is guaranteed to solve it efficiently. On the other hand, modern machine learning methods, like deep neural networks, often require solving a non-smooth and non-convex problem. Theoretically, non-convex mathematical optimization problems cannot in general be solved efficiently. However, in practice, gradient descent and its variants can find a global optimum efficiently. These competing facts show that there are often special structures in the optimization problems that make gradient descent succeed in practice. This thesis presents technical contributions to fill the gap between theory and practice on the gradient descent algorithm. The outline of the thesis is as follows. In the first part, we consider applying gradient descent to minimize the empirical risk of a neural network. We will show that if a multi-layer neural network with a smooth activation function is sufficiently wide, then randomly initialized gradient descent can efficiently find a global minimum of the empirical risk. We will also show the same result for the two-layer neural network with Rectified Linear Unit (ReLU) activation function. It is quite surprising that although the objective function of neural networks is non-convex, gradient descent can still find their global minimum. 
Lastly, we will study structural properties of the trajectory induced by the gradient descent algorithm. In the second part, we assume the label is generated from a two-layer teacher convolutional neural network and we consider using gradient descent to recover the teacher convolutional neural network. We will show that if the input distribution is Gaussian, then gradient descent can recover a one-hidden-layer convolutional neural network in which both the convolutional weights and the output weights are unknown parameters to be recovered. We will also show that the Gaussian input assumption can be relaxed to a general structural assumption if we only need to recover a single convolutional filter. In the third part, we study conditions under which gradient descent fails. We will show that gradient descent can take exponential time to optimize a smooth function with the strict saddle point property, for which noise-injected gradient descent can optimize in polynomial time. While our focus is theoretical, whenever possible, we also present experiments that illustrate our theoretical findings.
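The first-part claim can be illustrated with a small experiment: randomly initialized gradient descent on a wide two-layer ReLU network fitting arbitrary labels. This is only a sketch under assumed settings, not the thesis's construction. The sizes `n`, `d`, `m`, the step size `lr`, the fixed random-sign output weights and the 1/sqrt(m) scaling are all choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 10, 5, 2000                  # samples, input dim, hidden width (assumed)
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
y = rng.normal(size=n)                 # arbitrary labels

W = rng.normal(size=(m, d))            # first-layer weights, trained
a = rng.choice([-1.0, 1.0], size=m)    # output weights, kept fixed (assumption)

def predict(W):
    # f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x)
    return (np.maximum(X @ W.T, 0.0) @ a) / np.sqrt(m)

def loss(W):
    return 0.5 * np.sum((predict(W) - y) ** 2)

initial_loss = loss(W)
lr = 0.1                               # step size (assumed)
for _ in range(500):
    pre = X @ W.T                      # pre-activations, shape (n, m)
    residual = predict(W) - y          # shape (n,)
    # dL/dw_r = (1/sqrt(m)) * sum_i residual_i * a_r * 1[w_r . x_i > 0] * x_i
    grad = ((residual[:, None] * (pre > 0)) * a).T @ X / np.sqrt(m)
    W -= lr * grad
final_loss = loss(W)
```

Comparing `initial_loss` and `final_loss` shows the empirical risk falling even though the objective is non-convex in `W`.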
- Research Article
70
- 10.1371/journal.pone.0154863
- May 4, 2016
- PLOS ONE
Introduction: Confocal laser endomicroscopy (CLE) is becoming a popular method for optical biopsy of digestive mucosa for both diagnostic and therapeutic procedures. Computer-aided diagnosis of CLE images, using image processing and fractal analysis, can be used to quantify the histological structures in CLE-generated images. The aim of this study is to develop an automatic diagnosis algorithm for colorectal cancer (CRC), based on fractal analysis and neural network modeling of CLE-generated colon mucosa images. Materials and Methods: We retrospectively analyzed a series of 1035 artifact-free endomicroscopy images, obtained during CLE examinations from normal mucosa (356 images) and tumor regions (679 images). The images were processed using a computer-aided diagnosis (CAD) medical imaging system in order to obtain an automatic diagnosis. The CAD application includes image reading and processing functions, a module for fractal analysis, a grey-level co-occurrence matrix (GLCM) computation module, and a feature identification module based on the Marching Squares and linear interpolation methods. A two-layer neural network was trained to automatically interpret the imaging data and diagnose the pathological samples based on the fractal dimension and the characteristic features of the biological tissues. Results: Normal colon mucosa is characterized by regular polyhedral crypt structures, whereas malignant colon mucosa is characterized by irregular and interrupted crypts, which can be diagnosed by CAD. For this purpose, seven geometric parameters were defined for each image: fractal dimension, lacunarity, contrast, correlation, energy, homogeneity, and feature number. Of the seven parameters, only contrast, homogeneity and feature number were significantly different between normal and cancer samples. Next, a two-layer feed-forward neural network was trained on the seven parameters to automatically diagnose the malignant samples. 
The network's performance was measured by cross-entropy (training: 0.53, validation: 1.17, testing: 1.17) and by percent error (training: 16.14, validation: 17.42, testing: 15.48). The diagnosis accuracy error was 15.5%. Conclusions: Computer-aided diagnosis via fractal analysis of glandular structures can complement the traditional histological and minimally invasive imaging methods. A larger dataset from colorectal and other pathologies should be used to further validate the diagnostic power of the method.
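The abstract does not say how the fractal dimension was computed. One common estimator is box counting, sketched below on binary images. The function name, the box sizes and the two test shapes are assumptions for illustration, and the sketch requires a square image whose side is divisible by every box size.

```python
import numpy as np

def box_counting_dimension(mask, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a square binary image."""
    mask = np.asarray(mask, dtype=bool)
    side = mask.shape[0]
    counts = []
    for s in box_sizes:
        # Tile the image into s-by-s boxes and count boxes with any foreground pixel
        blocks = mask.reshape(side // s, s, side // s, s)
        counts.append(np.count_nonzero(blocks.any(axis=(1, 3))))
    # The slope of log N(s) against log(1/s) is the dimension estimate
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

filled = np.ones((64, 64), dtype=bool)   # a filled square
line = np.zeros((64, 64), dtype=bool)
line[32, :] = True                       # a single straight line

dim_filled = box_counting_dimension(filled)
dim_line = box_counting_dimension(line)
```

A filled square and a straight line serve as sanity checks, since their dimensions are known; irregular structures would fall at non-integer values in between.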
- Research Article
8
- 10.1016/j.neunet.2010.09.006
- Sep 16, 2010
- Neural Networks
Delay-induced primary rhythmic behavior in a two-layer neural network
- Research Article
1
- 10.1007/s41884-024-00142-3
- Jul 28, 2024
- Information Geometry
It is well known that a model can generalize even when it completely interpolates the training data, a phenomenon known as benign overfitting. Indeed, several works have theoretically shown that the minimum-norm interpolator can exhibit benign overfitting. On the other hand, deep learning models such as two-layer neural networks have been reported to outperform “shallow” learning models such as kernel methods under appropriate model sizes by adaptively learning the basis functions from the data. This mechanism is called feature learning, and it is known empirically to be beneficial even when the model size is large. However, it is generally difficult to show that benign overfitting occurs in learning models with feature learning, especially for regression problems. In this study, we analyze the predictive error of the estimator after one step of feature learning in a two-layer linear neural network optimized by gradient descent methods and study the effect of feature learning on benign overfitting. The results show that feature learning reduces bias compared to a one-layer linear regression model without feature learning, especially when the eigenvalues of the covariance of the input decay slowly. On the other hand, we clarify that the variance is hardly changed by feature learning. This differs significantly from the results for benign overfitting in the situation without feature learning and indicates the usefulness of feature learning.
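The minimum-norm interpolator mentioned above can be sketched for plain linear regression, which is the baseline without feature learning and not the paper's two-layer estimator. The sizes `n` and `d` and the Gaussian data are assumptions chosen so that there are more features than samples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                        # fewer samples than features (assumed sizes)
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# Minimum l2-norm solution of X w = y, via the Moore-Penrose pseudoinverse
w_min = np.linalg.pinv(X) @ y

# Any other interpolator adds a null-space component and so has a larger norm
z = rng.normal(size=d)
null_part = z - np.linalg.pinv(X) @ (X @ z)   # projection of z onto null(X)
w_other = w_min + null_part
```

Both `w_min` and `w_other` fit the training data exactly; they differ only in norm, which is the quantity the benign-overfitting analyses track.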
- Research Article
8
- 10.1016/0003-2670(93)80093-z
- Jun 1, 1993
- Analytica Chimica Acta
Cross-peak classification in two-dimensional nuclear magnetic resonance spectra using a two-layer neural network
- Research Article
- 10.1299/kikaic.60.569
- Jan 1, 1994
- TRANSACTIONS OF THE JAPAN SOCIETY OF MECHANICAL ENGINEERS Series C
A two-layered neural network was applied to adaptive control of a servo mechanism. The two-layered neural network is simple and can be built in a structure corresponding to the inverse dynamics of a controlled plant. A demand signal, previous output signals of the plant, and previous control input signals are fed into the network. The output of the network is the control input, which is fed into the controlled plant. Initial weights can be set using information on nominal plant parameters. The weights are updated by a back-propagation strategy with a normalized learning rate, which may specify the rate of convergence of learning. The validity of the neural network controller was examined through experiments using an electrohydraulic servo motor system, and the adaptation function of the neural network was demonstrated.
- Research Article
29
- 10.1007/s00526-021-02156-6
- Feb 3, 2022
- Calculus of Variations and Partial Differential Equations
We study the natural function space for infinitely wide two-layer neural networks with ReLU activation (Barron space) and establish different representation formulae. In two cases, we describe the space explicitly up to isomorphism. Using a convenient representation, we study the pointwise properties of two-layer networks and show that functions whose singular set is fractal or curved (for example distance functions from smooth submanifolds) cannot be represented by infinitely wide two-layer networks with finite path-norm. We use this structure theorem to show that the only $C^1$-diffeomorphisms which preserve Barron space are affine. Furthermore, we show that every Barron function can be decomposed as the sum of a bounded and a positively one-homogeneous function and that there exist Barron functions which decay rapidly at infinity and are globally Lebesgue-integrable. This result suggests that two-layer neural networks may be able to approximate a greater variety of functions than commonly believed.
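For orientation, the finite-width objects behind these terms can be written as below. This follows one common convention and is not taken from the paper: the symbols $m$, $a_i$, $w_i$, $b_i$ are introduced here for illustration, and the choice of norm on $w_i$ varies between authors.

```latex
f(x) = \sum_{i=1}^{m} a_i \, \sigma\!\left(w_i^{\top} x + b_i\right),
\qquad \sigma(z) = \max\{z, 0\},
\qquad
\|f\|_{\mathrm{path}} = \sum_{i=1}^{m} |a_i| \left( \|w_i\| + |b_i| \right).
```

The ReLU satisfies $\sigma(\lambda z) = \lambda\,\sigma(z)$ for every $\lambda \ge 0$, which is the positive one-homogeneity that appears in the decomposition result above.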
- Conference Article
- 10.1109/ijcnn.2014.6889774
- Jul 1, 2014
The learning process is an important part of two-layer networks. It is imperative to search for an optimal learning rate to get the maximum error reduction in each learning step. The literature of the past decades has proposed various methods to find such an optimal learning rate. In this paper, we propose an improved dynamic optimal learning rate by adding an optimal ratio k. We find that the improved dynamic optimal learning rate gives better results in learning processes. We also prove the existence of the ratio k by giving it a proper mathematical expression. Furthermore, we apply the improved learning rate to an inverse problem and compare it with the previous approach. The proposed method performs better. We therefore conclude that our new method of searching for a dynamic optimal learning rate is valuable in intelligent learning applications of neural networks, or is at least effective on the tested problem.
- Research Article
3
- 10.1109/tit.2023.3274152
- Sep 1, 2023
- IEEE Transactions on Information Theory
LASSO regularization is a popular regression tool to enhance the prediction accuracy of statistical models by performing variable selection through the $\ell_1$ penalty, initially formulated for the linear model and its variants. In this paper, the territory of LASSO is extended to two-layer ReLU neural networks, a fashionable and powerful nonlinear regression model. Specifically, given a neural network whose output $y$ depends only on a small subset of input $\boldsymbol{x}$, denoted by $\mathcal{S}^{\star}$, we prove that the LASSO estimator can stably reconstruct the neural network and identify $\mathcal{S}^{\star}$ when the number of samples scales logarithmically with the input dimension. This challenging regime has been well understood for linear models while barely studied for neural networks. Our theory lies in an extended Restricted Isometry Property (RIP)-based analysis framework for two-layer ReLU neural networks, which may be of independent interest to other LASSO or neural network settings. Based on the result, we advocate a neural network-based variable selection method. 
Experiments on simulated and real-world datasets show promising performance of the variable selection approach compared with existing techniques.
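How an $\ell_1$ penalty produces variable selection can be sketched with its proximal operator, soft-thresholding. This is not the paper's estimator: applying the penalty to the first-layer weights, the helper names `soft_threshold` and `selected_inputs`, and the weight values are all assumptions for illustration.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||v||_1, applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def selected_inputs(W, tol=0.0):
    """Indices of input coordinates whose first-layer weight column is nonzero."""
    return np.flatnonzero(np.abs(W).max(axis=0) > tol)

# Hypothetical first-layer weights, shape (hidden units, inputs)
W = np.array([[0.9, 0.05, -0.6],
              [-0.7, -0.02, 0.4]])
W_sparse = soft_threshold(W, 0.1)     # one proximal step with threshold 0.1
support = selected_inputs(W_sparse)   # inputs that survive the shrinkage
```

In a proximal gradient loop this step would alternate with a gradient step on the data-fit term; an input whose whole column is shrunk to zero no longer influences the network output.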
- Research Article
13
- 10.1016/j.cnsns.2023.107279
- Apr 29, 2023
- Communications in Nonlinear Science and Numerical Simulation
Event-triggered synchronization of a two-layer heterogeneous neural network via hybrid control
- Conference Article
12
- 10.1109/ikt.2013.6620045
- May 1, 2013
Host intrusion detection systems (HIDS) are an emerging class of techniques for information security in host-based applications. These systems should be designed to prevent unauthorized access to system resources and data. Many intelligent learning techniques are currently being applied to large volumes of data to construct efficient host intrusion detection systems. This paper presents a hybrid approach for modeling HIDS that combines anomaly and misuse detection, based on a two-layer genetic algorithm and neural network, and uses simple data mining techniques to process web application traffic. The two-layer genetic algorithm and the neural network are applied as anomaly and misuse detection, respectively. Suspicious intrusions can be traced back to their original source. The proposed model is able to detect critical vulnerabilities based on the Open Web Application Security Project (OWASP).
- Conference Article
3
- 10.1109/iacet.1995.527584
- May 22, 1995
This paper presents an adaptive neural net controller for controlling given plants which are unknown. In the neural net structure, a two-layered network is used to emulate the unknown plant dynamics, and another two-layer neural network, which is the inverse of the estimator, is used to generate the control action on-line. A modified Widrow-Hoff delta rule is adopted as a learning algorithm to minimize the error between the real plant response and the output of the estimator. An effective learning method based on sliding motions is provided to tune the control action to improve the system performance and convergence. The major advantage of the proposed approach is that the lengthy training of the controller might be eliminated. The effectiveness of the proposed approach is illustrated through simulations of controlling an unstable plant and a normalized motor model with noise disturbances.
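The abstract does not specify the paper's modification, so the sketch below shows only the standard Widrow-Hoff (LMS) delta rule, used to identify a linear plant from input-output samples. The plant parameters `true_w`, the learning rate `eta` and the noise-free Gaussian regressors are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([0.5, -0.3, 0.8])   # hypothetical plant parameters
w = np.zeros(3)                       # estimator weights, initialized at zero
eta = 0.05                            # learning rate (assumed)

for _ in range(2000):
    x = rng.normal(size=3)            # regressor vector for this step
    target = true_w @ x               # noise-free plant response
    error = target - w @ x            # estimator output error
    w += eta * error * x              # standard Widrow-Hoff (LMS) update
```

Each update moves the weights along the current regressor in proportion to the output error, so the estimator tracks the plant on-line without a separate batch training phase.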
- Research Article
19
- 10.1103/physreve.56.3426
- Sep 1, 1997
- Physical Review E
An adaptive back-propagation algorithm parametrized by an inverse temperature $\beta$ is studied and compared with gradient descent (standard back-propagation) for on-line learning in two-layer neural networks with an arbitrary number of hidden units. Within a statistical mechanics framework, we analyze these learning algorithms in both the symmetric and the convergence phase for finite learning rates in the case of uncorrelated teachers of similar but arbitrary length $T$. These analyses show that adaptive back-propagation results generally in faster training by breaking the symmetry between hidden units more efficiently and by providing faster convergence to optimal generalization than gradient descent.
- Research Article
4
- 10.26516/1997-7670.2022.39.111
- Jan 1, 2022
- The Bulletin of Irkutsk State University. Series Mathematics
Previously, for each multilayer feedforward neural network (hereinafter, simply a neural network), finite commutative groupoids were introduced, which were called additive subnet groupoids. These groupoids are closely related to the subnets of the neural network over which they are built. A groupoid is a monoid if and only if it is built over a two-layer neural network. Earlier, endomorphisms and their properties were studied for these groupoids. Some endomorphisms were constructed, but an exhaustive element-by-element description was not obtained. It was shown that every finite monoid is isomorphic to some submonoid of the monoid of all endomorphisms of a suitable additive subnet groupoid for some suitable neural network. In this paper, we study endomorphisms of additive groupoids of subnets of two-layer neural networks. The main result of the work is an element-wise description of the monoid of all endomorphisms of additive monoids of subnets built over a two-layer neural network. The element-wise description is obtained by constructing a general form of an endomorphism. The general form of an endomorphism is parameterized by the endomorphisms of suitable Booleans with respect to the union operation. Therefore, endomorphisms of these Booleans were studied in this work. In particular, the semirings of endomorphisms of these Booleans with respect to the union were studied. In addition, to describe the general form of an endomorphism of the additive monoid of subnets, homomorphisms of one Boolean into another (with respect to union) were used.
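The endomorphisms of a Boolean (the set of all subsets of a finite set) with respect to union can be enumerated by brute force for very small sets, as sketched below. This illustrates the definition only and is not the paper's construction. The bitmask encoding, the function name and the `preserve_empty` switch (whether the empty set must map to itself) are assumptions for this example.

```python
from itertools import product

def union_endomorphisms(n, preserve_empty=True):
    """Brute-force all maps f on subsets of {0..n-1} with f(A | B) == f(A) | f(B).

    Subsets are encoded as bitmasks, so set union is bitwise OR.
    """
    subsets = range(2 ** n)
    found = []
    for images in product(subsets, repeat=2 ** n):   # images[A] = f(A)
        if preserve_empty and images[0] != 0:
            continue
        if all(images[a | b] == images[a] | images[b]
               for a in subsets for b in subsets):
            found.append(images)
    return found

count_two = len(union_endomorphisms(2))                        # identity-preserving maps
count_one = len(union_endomorphisms(1, preserve_empty=False))  # all union-preserving maps
```

For a two-element set, an identity-preserving endomorphism is fixed by the images of the two singletons, which can be chosen freely. The search is exponential in the number of subsets, so it is practical only for `n` up to about 3.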