The mathematics of adversarial attacks in AI – why deep learning is unstable despite the existence of stable neural networks
Abstract: The unprecedented success of deep learning (DL) makes it unchallenged when it comes to classification problems. However, it is well established that the current DL methodology produces universally unstable neural networks (NNs). The instability problem has prompted a substantial research effort – with a vast literature on so-called adversarial attacks – yet no solution has emerged. Our paper addresses why, as we prove the following: any training procedure based on training rectified linear unit (ReLU) neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate) – despite the provable existence of both accurate and stable neural networks for the same classification problems. The key is that the stable and accurate neural networks must have variable dimensions depending on the input; in particular, variable dimensions are a necessary condition for stability. Our result points towards the paradox that accurate and stable neural networks exist, yet modern algorithms do not compute them. This raises the question: if the existence of neural networks with desirable properties can be proven, can one also find algorithms that compute them? There are cases in mathematics where provable existence implies computability, but will this be the case for neural networks? The contrary is true, as we demonstrate how neural networks can provably exist as approximate minimisers to standard optimisation problems with standard cost functions, yet no randomised algorithm can compute them with probability better than $1/2$.
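The instability the abstract describes can be illustrated with a minimal sketch: a tiny hand-built ReLU classifier whose steep decision boundary lets an imperceptibly small input perturbation flip the predicted class. All weights and inputs below are invented purely for illustration; this is not the paper's construction.

```python
import numpy as np

# A tiny hand-built ReLU classifier f(x) = sign(w2 @ relu(W1 @ x)).
# The large first-layer weights create a steep decision boundary,
# so a tiny input change flips the class (an "adversarial" perturbation).
relu = lambda z: np.maximum(z, 0.0)
W1 = np.array([[100.0, 0.0], [-100.0, 0.0]])
w2 = np.array([1.0, -1.0])

def classify(x):
    return int(np.sign(w2 @ relu(W1 @ x)))

x = np.array([0.01, 0.5])           # clean input, classified as +1
x_adv = x + np.array([-0.02, 0.0])  # perturbation of size 0.02

print(classify(x), classify(x_adv))  # class flips: 1 -1
```

The perturbation has Euclidean norm 0.02, yet the output flips; accuracy on the clean point coexists with instability in its neighbourhood, which is the tension the theorem formalizes.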
- Research Article
72
- 10.1016/j.neucom.2015.07.086
- Aug 10, 2015
- Neurocomputing
Development of a Self-Regulating Evolving Spiking Neural Network for classification problem
- Research Article
3
- 10.1016/j.eswa.2024.123905
- Apr 5, 2024
- Expert Systems With Applications
Demonstrating a new evaluation method on ReLU based Neural Networks for classification problems
- Research Article
6
- 10.1016/j.neunet.2021.05.007
- May 12, 2021
- Neural Networks
Experimental stability analysis of neural networks in classification problems with confidence sets for persistence diagrams
- Book Chapter
1
- 10.1007/978-3-642-45111-9_12
- Jan 1, 2013
The aim of this paper is to analyze the potential of Bidirectional Recurrent Neural Networks in classification problems. Different functions are proposed to merge the network outputs into one single classification decision. To analyze when these networks could be useful, artificial datasets were constructed to compare their performance against well-known classification methods in different situations, such as complex and simple decision boundaries and related and independent features. The advantage of this neural network in classification problems with complicated decision boundaries and feature relations was demonstrated statistically. Finally, better results were also obtained using this network topology in the prediction of HIV drug resistance.
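One simple merge function of the kind the abstract mentions is averaging the per-timestep class scores from both directions and taking the argmax. The sketch below uses random stand-in scores; the shapes and the mean-merge rule are illustrative assumptions, not the paper's exact proposal.

```python
import numpy as np

# Hypothetical per-timestep class scores from the forward and backward
# passes of a bidirectional recurrent network (T timesteps, C classes).
rng = np.random.default_rng(0)
T, C = 5, 3
fwd = rng.random((T, C))
bwd = rng.random((T, C))

def merge_mean(fwd, bwd):
    # One plausible merge function: average all per-timestep scores
    # from both directions, then pick the highest-scoring class.
    scores = np.concatenate([fwd, bwd]).mean(axis=0)
    return int(scores.argmax())

label = merge_mean(fwd, bwd)
print(label)
```

Other merges (max over timesteps, last-step-only, learned weighting) slot into the same interface by replacing `merge_mean`.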
- Single Book
4
- 10.1007/bfb0098154
- Jan 1, 1999
Foundations and Tools for Neural Modeling
- Book Chapter
- 10.1007/978-3-319-35162-9_5
- Jan 1, 2016
The purpose of this chapter is the consideration and analysis of fuzzy neural networks in classification problems, which have wide use in industry, economics, sociology, medicine, etc. In Sect. 5.2, a basic fuzzy neural network for classification, NEFClass, is considered; the learning algorithms for its rule base and the membership functions (MFs) of its fuzzy sets are presented and investigated. The advantages and shortcomings of NEFClass are analyzed, and its modification, FNN NEFClass M, which is free of the shortcomings of NEFClass, is described in Sect. 5.3. The results of numerous comparative experiments on the basic and modified NEFClass systems are described in Sect. 5.4. The practically important task of recognizing objects in electro-optical images (EOI) is considered, and its solution using FNN NEFClass is presented in Sect. 5.5, together with a comparative analysis of different learning algorithms for FNN NEFClass on the task of recognizing EOI objects in the presence of noise. The problem of handwritten mathematical expression recognition is considered in Sect. 5.6, and its solution using FNN NEFClass is presented.
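The NEFClass idea of classifying via fuzzy rules over membership functions can be sketched minimally: fuzzify each input with triangular MFs, fire rules with a min conjunction, and classify by the strongest rule. The membership parameters and the two-rule base below are invented for illustration only.

```python
# Minimal sketch of the NEFClass idea: fuzzify inputs with triangular
# membership functions, fire fuzzy rules via min, classify via max.
def tri(x, a, b, c):
    # Triangular membership function rising on [a, b], falling on [b, c].
    return max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))

def classify(x1, x2):
    low  = lambda v: tri(v, -0.5, 0.0, 0.5)
    high = lambda v: tri(v, 0.5, 1.0, 1.5)
    # Toy rule base: IF x1 low AND x2 low THEN class 0;
    #                IF x1 high AND x2 high THEN class 1.
    rule0 = min(low(x1), low(x2))
    rule1 = min(high(x1), high(x2))
    return 0 if rule0 >= rule1 else 1

print(classify(0.1, 0.1), classify(0.9, 0.8))  # 0 1
```

In NEFClass proper, both the rule base and the MF parameters are learned from data rather than fixed by hand as here.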
- Research Article
50
- 10.1016/j.neunet.2020.06.024
- Jul 3, 2020
- Neural Networks
Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness
- Book Chapter
- 10.5772/intechopen.1002652
- Nov 23, 2023
Traditional neural network training is usually based on maximum likelihood to obtain the appropriate network parameters, including weights and biases, given the training data. However, if the available data are finite and noisy, maximum likelihood-based training can cause the trained neural network to overfit the noisy data. This problem has been overcome by applying Bayesian inference to neural network training in various applications. Bayesian inference allows the values of regularization parameters to be found using only the training data. In addition, the Bayesian approach allows different models (e.g., neural networks with different numbers of hidden units) to be compared using only the training data. Neural networks trained with Bayesian inference are also known as Bayesian neural networks (BNNs). This chapter focuses on BNNs for classification problems, with the model complexity of BNNs conveniently handled by a method known as the evidence framework.
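The connection between the prior and the regularization parameter can be shown on a toy logistic classifier: with a Gaussian prior of precision alpha on the weights, MAP training equals maximum likelihood plus a weight-decay penalty whose strength comes from the prior rather than a held-out set. The data, alpha value, and learning rate below are invented toy choices, not the chapter's method.

```python
import numpy as np

# MAP training of a logistic classifier: negative log-posterior =
# cross-entropy loss + (alpha/2) * ||w||^2, descended by gradient steps.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # linearly separable toy labels

alpha = 1.0                                  # prior precision == regularizer
w = np.zeros(2)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    grad = X.T @ (p - y) / len(y) + alpha * w / len(y)
    w -= 0.5 * grad                          # gradient descent step

acc = float(((X @ w > 0) == (y == 1)).mean())
print(acc)
```

The evidence framework goes a step further: it treats alpha itself as something to be inferred by maximizing the marginal likelihood of the training data.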
- Research Article
90
- 10.1073/pnas.2107151119
- Mar 16, 2022
- Proceedings of the National Academy of Sciences
Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities; however, there does not exist any algorithm, even randomized, that can train (or compute) such a NN. For any positive integers K>2 and L, there are cases where simultaneously 1) no randomized training algorithm can compute a NN correct to K digits with probability greater than 1/2; 2) there exists a deterministic training algorithm that computes a NN with K-1 correct digits, but any such (even randomized) algorithm needs arbitrarily many training data; and 3) there exists a deterministic training algorithm that computes a NN with K-2 correct digits using no more than L training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce fast iterative restarted networks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only O(|log(ϵ)|) layers are needed for an ϵ-accurate solution to the inverse problem.
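The O(|log(ϵ)|) layer count reflects linear convergence: if each layer contracts the error by a fixed factor ρ < 1, reaching accuracy ϵ takes about log(ϵ)/log(ρ) layers. The contraction factor ρ = 0.5 below is an assumed illustrative value, not a constant from the paper.

```python
import math

# Layers needed under linear convergence: error after k layers is
# rho**k, so rho**k <= eps gives k >= log(eps) / log(rho).
def layers_needed(eps, rho=0.5):
    return math.ceil(math.log(eps) / math.log(rho))

print(layers_needed(1e-3), layers_needed(1e-6))  # 10 20
```

Note that halving ϵ's exponent range from 1e-3 to 1e-6 only doubles the depth, which is the practical payoff of the logarithmic bound.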
- Book Chapter
11
- 10.1007/bfb0032534
- Jan 1, 1997
Limited precision neural networks are better suited for hardware implementations. Several researchers have proposed various algorithms which are able to train neural networks with limited precision weights. It has also been suggested that the limits introduced by limited precision weights can be compensated for by an increased number of layers. This paper shows that, from a theoretical point of view, neural networks with integer weights in the range [-p,p] can solve classification problems for which the minimum Euclidean distance between two patterns from opposite classes is 1/p. This result can be used in an information-theoretic context to calculate a bound on the number of bits necessary for solving a problem. It is shown that the number of bits is limited by m*n*log(2pD), where m is the number of patterns, n is the dimensionality of the space, p is the weight range, and D is the radius of a sphere including all patterns. Keywords: neural networks, entropy, classification problems, integer weights, number of bits
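The bit bound is a direct formula and can be evaluated numerically. Reading the abstract's log as base 2 (natural for counting bits) is an assumption here, and the example values of m, n, p, and D are invented.

```python
import math

# Upper bound on the bits needed: m * n * log2(2 * p * D), where
# m = number of patterns, n = dimensionality, p = integer weight range,
# D = radius of a sphere enclosing all patterns.
def bits_bound(m, n, p, D):
    return m * n * math.log2(2 * p * D)

# e.g. 100 patterns, 10 dimensions, weight range p = 8, radius D = 4
print(bits_bound(100, 10, 8, 4))  # 6000.0
```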
- Book Chapter
- 10.1007/3-540-64575-6_58
- Jan 1, 1998
This paper presents some complexity results for the specific case of a VLSI-friendly neural network used in classification problems. A VLSI-friendly neural network is a neural network using exclusively integer weights in a narrow interval. The results presented here give updated worst-case lower bounds for the number of weights used by the network. It is shown that the number of weights can be lower bounded by an expression calculated from parameters depending exclusively on the problem (the minimum distance between patterns of opposite classes, the maximum distance between any patterns, the number of patterns, and the number of dimensions). The theoretical approach is used to calculate the necessary weight range, a lower bound for the number of bits necessary to solve the problem in the worst case, and the necessary number of weights for several problems. Then, a constructive algorithm using limited precision integer weights is used to construct and train neural networks for the same problems. The experimental values obtained are then compared with the calculated theoretical values. The comparison shows that the necessary weight precision can be estimated accurately using the given approach; however, the estimated numbers of weights are in general larger than the values obtained experimentally.
- Conference Article
4
- 10.1109/ijcnn.1999.831551
- Jul 10, 1999
This paper analyzes some aspects of the computational power of neural networks (NN) using integer weights in a very restricted range. Using limited-range integer values opens the road for efficient VLSI implementations because: 1) a limited range for the weights can be translated into reduced storage requirements, and 2) integer computation can be implemented more efficiently than floating point. The paper shows that a neural network using integer weights in the range [-p,p] (where p is a small integer value) can classify correctly any set of patterns included in a hypercube of unit side length centered around the origin of R^n, n ≥ 2, for which the minimum Euclidean distance between two patterns of opposite classes is d_min ≥ √(n-1)/(2p).
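The condition d_min ≥ √(n-1)/(2p) can be rearranged to find the smallest integer weight range p that accommodates a given separation, p ≥ √(n-1)/(2·d_min). The example dimensionality and separation below are invented.

```python
import math

# Smallest integer weight range p satisfying d_min >= sqrt(n-1) / (2p),
# i.e. p >= sqrt(n-1) / (2 * d_min), rounded up to an integer.
def min_weight_range(n, d_min):
    return math.ceil(math.sqrt(n - 1) / (2 * d_min))

# e.g. 10-dimensional patterns separated by at least 0.3
print(min_weight_range(10, 0.3))  # 5
```

This makes the hardware trade-off explicit: higher dimensionality or tighter class separation forces a wider (more bits per weight) integer range.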
- Research Article
40
- 10.1016/j.cor.2010.05.001
- May 6, 2010
- Computers & Operations Research
A hybrid radial basis function and data envelopment analysis neural network for classification
- Research Article
149
- 10.1109/tsmcb.2005.847740
- Oct 1, 2005
- IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics)
There are numerous combinations of neural networks (NNs) and evolutionary algorithms (EAs) used in classification problems. EAs have been used to train the networks, design their architecture, and select feature subsets. However, most of these combinations have been tested on only a few data sets, and many comparisons are done inappropriately, measuring performance on training data or omitting proper statistical tests to support the conclusions. This paper presents an empirical evaluation of eight combinations of EAs and NNs on 15 public-domain and artificial data sets. Our objective is to identify the methods that consistently produce accurate classifiers that generalize well. In most cases, the combinations of EAs and NNs performed equally well on the data sets we tried and were not more accurate than hand-designed neural networks trained with simple backpropagation.
- Research Article
- 10.15587/2706-5448.2022.252695
- Feb 11, 2022
- Technology audit and production reserves
The object of this research is the ability to combine a previously trained feedforward deep neural network model with user data in problems of determining the class of a single object in an image; that is, the process of transfer learning in convolutional neural networks for classification problems. The research is based on comparing the theoretical and practical results obtained when training convolutional neural networks. The main objective is to conduct two different learning processes. The first is traditional training, in which every epoch adjusts the values of all weights in each layer of the network, followed by training the neural network on a sample of image data. The second is learning with transfer learning methods: when initializing a pre-trained network, the weights of all its layers are "frozen" except for the last fully connected layer. This layer is replaced by a new one whose number of outputs equals the number of classes in the sample, and its parameters are initialized with random values drawn from a normal distribution. The convolutional neural network is then trained on the given sample. After training, the results of the two processes were compared. In conclusion, training convolutional neural networks with transfer learning techniques can be applied to a variety of classification tasks, ranging from digits to space objects (stars and quasars). The amount of computing resources required is also quite important, because not every convolutional neural network model can be fully trained without powerful computer systems and a large number of images in the training sample.
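The freeze-and-replace recipe described above can be sketched in plain NumPy: a "pretrained" feature layer is held fixed while only a new, normally initialized classification head is trained. The pretrained weights, data, learning rate, and class count below are random stand-ins, not the study's actual networks.

```python
import numpy as np

# Transfer-learning sketch: frozen feature extractor + trainable new head.
rng = np.random.default_rng(0)
W_frozen = rng.normal(size=(8, 4))      # "pretrained" layer, never updated
X = rng.normal(size=(64, 4))            # stand-in input features
y = rng.integers(0, 3, size=64)         # 3 target classes

W_new = rng.normal(size=(3, 8))         # replacement head, N(0, 1) init
for _ in range(100):
    H = np.maximum(X @ W_frozen.T, 0.0)              # frozen ReLU features
    Z = H @ W_new.T                                  # head logits
    P = np.exp(Z - Z.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)                # softmax
    G = P.copy()
    G[np.arange(len(y)), y] -= 1.0                   # cross-entropy gradient
    W_new -= 0.01 * (G.T @ H) / len(y)               # only the head updates

print(W_new.shape)
```

In a framework like PyTorch the same idea is expressed by setting `requires_grad = False` on the pretrained parameters and swapping the final fully connected layer for one with the right number of outputs.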