The mathematics of adversarial attacks in AI – why deep learning is unstable despite the existence of stable neural networks

Abstract

The unprecedented success of deep learning (DL) makes it unchallenged when it comes to classification problems. However, it is well established that the current DL methodology produces universally unstable neural networks (NNs). The instability problem has caused a substantial research effort – with a vast literature on so-called adversarial attacks – yet there has been no solution to the problem. Our paper addresses why there has been no solution to the problem, as we prove the following: any training procedure based on training rectified linear unit (ReLU) neural networks for classification problems with a fixed architecture will yield neural networks that are either inaccurate or unstable (if accurate) – despite the provable existence of both accurate and stable neural networks for the same classification problems. The key is that the stable and accurate neural networks must have variable dimensions depending on the input; in particular, a variable dimension is a necessary condition for stability. Our result points towards the paradox that accurate and stable neural networks exist; however, modern algorithms do not compute them. This yields the question: if the existence of neural networks with desirable properties can be proven, can one also find algorithms that compute them? There are cases in mathematics where provable existence implies computability, but will this be the case for neural networks? The contrary is true, as we demonstrate how neural networks can provably exist as approximate minimisers to standard optimisation problems with standard cost functions; however, no randomised algorithm can compute them with probability better than $1/2$.
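The instability described in the abstract can be illustrated with a minimal sketch (this is not the paper's construction; the weights below are hypothetical values chosen for demonstration): a fixed-architecture ReLU classifier that is accurate on its training data must still have a decision boundary somewhere, and any input near that boundary is flipped by an arbitrarily small perturbation.

```python
import numpy as np

# A tiny fixed-architecture ReLU classifier with hand-picked, hypothetical
# weights. Its decision boundary sits at x = 0.5, so a perturbation of
# size 0.001 is enough to flip the predicted class of a nearby input.
W1 = np.array([10.0, -10.0])   # hidden-layer weights
b1 = np.array([-5.0, 5.0])     # hidden-layer biases
W2 = np.array([1.0, -1.0])     # output weights

def classify(x):
    h = np.maximum(W1 * x + b1, 0.0)   # ReLU activation
    return int(W2 @ h > 0)             # threshold the scalar output

x, eps = 0.4995, 0.001   # eps is tiny relative to the input scale
print(classify(x), classify(x + eps))   # the label flips under eps
```

The paper's result is stronger than this sketch suggests: it is not merely that boundary points exist, but that fixed-architecture training cannot avoid placing such unstable points where accuracy matters, whereas networks with input-dependent (variable) dimensions can.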

Similar Papers
  • Research Article
  • Cited by 72
  • 10.1016/j.neucom.2015.07.086
Development of a Self-Regulating Evolving Spiking Neural Network for classification problem
  • Aug 10, 2015
  • Neurocomputing
  • S Dora + 3 more


  • Research Article
  • Cited by 3
  • 10.1016/j.eswa.2024.123905
Demonstrating a new evaluation method on ReLU based Neural Networks for classification problems
  • Apr 5, 2024
  • Expert Systems With Applications
  • Dávid Tollner + 3 more


  • Research Article
  • Cited by 6
  • 10.1016/j.neunet.2021.05.007
Experimental stability analysis of neural networks in classification problems with confidence sets for persistence diagrams
  • May 12, 2021
  • Neural Networks
  • Naoki Akai + 2 more


  • Book Chapter
  • Cited by 1
  • 10.1007/978-3-642-45111-9_12
Bidirectional Recurrent Neural Networks for Biological Sequences Prediction
  • Jan 1, 2013
  • Isis Bonet + 2 more

The aim of this paper is to analyze the potential of Bidirectional Recurrent Neural Networks in classification problems. Different functions are proposed to merge the network outputs into one single classification decision. In order to analyze when these networks could be useful, artificial datasets were constructed to compare their performance against well-known classification methods in different situations, such as complex and simple decision boundaries and related and independent features. The advantage of this neural network in classification problems with complicated decision boundaries and feature relations was proved statistically. Finally, better results using this network topology were also obtained in the prediction of HIV drug resistance.

  • Single Book
  • Cited by 4
  • 10.1007/bfb0098154
Foundations and Tools for Neural Modeling
  • Jan 1, 1999
  • Juan V Sánchez-Andrés


  • Book Chapter
  • 10.1007/978-3-319-35162-9_5
Fuzzy Neural Networks in Classification Problems
  • Jan 1, 2016
  • Mikhail Z Zgurovsky + 1 more

The purpose of this chapter is the consideration and analysis of fuzzy neural networks in classification problems, which are widely used in industry, economics, sociology, medicine, etc. Sect. 5.2 presents a basic fuzzy neural network for classification, NEFClass, together with the learning algorithms for its rule base and the membership functions of its fuzzy sets. The advantages and shortcomings of the NEFClass system are analyzed, and its modification, FNN NEFClass M, which is free of the shortcomings of NEFClass, is described in Sect. 5.3. The results of numerous comparative experiments with the basic and modified NEFClass systems are described in Sect. 5.4. The practically important task of recognizing objects in electro-optical images (EOI) is considered in Sect. 5.5, where its solution using FNN NEFClass is presented, along with a comparative analysis of different learning algorithms for FNN NEFClass on the EOI recognition task in the presence of noise. The problem of recognizing hand-written mathematical expressions is considered in Sect. 5.6, and its solution using FNN NEFClass is presented.

  • Research Article
  • Cited by 50
  • 10.1016/j.neunet.2020.06.024
Quantifying the generalization error in deep learning in terms of data distribution and neural network smoothness
  • Jul 3, 2020
  • Neural Networks
  • Pengzhan Jin + 3 more


  • Book Chapter
  • 10.5772/intechopen.1002652
Bayesian Inference for Regularization and Model Complexity Control of Artificial Neural Networks in Classification Problems
  • Nov 23, 2023
  • Son T Nguyen + 4 more

Traditional neural network training is usually based on maximum likelihood to obtain the appropriate network parameters, including weights and biases, given the training data. However, if the available data are finite and noisy, maximum-likelihood training can cause the trained neural network to overfit the noisy data. This problem has been overcome by applying Bayesian inference to neural network training in various applications. Bayesian inference allows values of regularization parameters to be found using only the training data. In addition, the Bayesian approach allows different models (e.g., neural networks with different numbers of hidden units) to be compared using only the training data. Neural networks trained with Bayesian inference are also known as Bayesian neural networks (BNNs). This chapter focuses on BNNs for classification problems, with the model complexity of BNNs conveniently handled by a method known as the evidence framework.

  • Research Article
  • Cited by 90
  • 10.1073/pnas.2107151119
The difficulty of computing stable and accurate neural networks: On the barriers of deep learning and Smale’s 18th problem
  • Mar 16, 2022
  • Proceedings of the National Academy of Sciences
  • Matthew J Colbrook + 2 more

Deep learning (DL) has had unprecedented success and is now entering scientific computing with full force. However, current DL methods typically suffer from instability, even when universal approximation properties guarantee the existence of stable neural networks (NNs). We address this paradox by demonstrating basic well-conditioned problems in scientific computing where one can prove the existence of NNs with great approximation qualities; however, there does not exist any algorithm, even randomized, that can train (or compute) such a NN. For any positive integers K > 2 and L, there are cases where simultaneously 1) no randomized training algorithm can compute a NN correct to K digits with probability greater than 1/2; 2) there exists a deterministic training algorithm that computes a NN with K−1 correct digits, but any such (even randomized) algorithm needs arbitrarily many training data; and 3) there exists a deterministic training algorithm that computes a NN with K−2 correct digits using no more than L training samples. These results imply a classification theory describing conditions under which (stable) NNs with a given accuracy can be computed by an algorithm. We begin this theory by establishing sufficient conditions for the existence of algorithms that compute stable NNs in inverse problems. We introduce fast iterative restarted networks (FIRENETs), which we both prove and numerically verify are stable. Moreover, we prove that only O(|log(ϵ)|) layers are needed for an ϵ-accurate solution to the inverse problem.
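The O(|log(ϵ)|) depth claim above has a simple back-of-the-envelope reading: if each restarted block reduces the error by a fixed factor ρ < 1 (ρ below is a hypothetical convergence factor, not a value from the paper), then reaching accuracy ϵ takes about log(ϵ)/log(ρ) blocks, i.e. depth linear in the number of accurate digits requested.

```python
import math

# Sketch of the O(|log(eps)|) depth bound under an assumed linear
# convergence factor rho: each block multiplies the error by rho,
# so rho**k <= eps once k >= log(eps) / log(rho).
def layers_needed(eps, rho=0.5):
    return math.ceil(math.log(eps) / math.log(rho))

for eps in (1e-2, 1e-4, 1e-8):
    print(eps, layers_needed(eps))
```

Doubling the number of required digits roughly doubles the depth, which is the qualitative content of the O(|log(ϵ)|) statement.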

  • Book Chapter
  • Cited by 11
  • 10.1007/bfb0032534
On the possibilities of the limited precision weights neural networks in classification problems
  • Jan 1, 1997
  • Sorin Draghici + 1 more

Limited-precision neural networks are better suited for hardware implementations. Several researchers have proposed various algorithms which are able to train neural networks with limited-precision weights. It has also been suggested that the limits introduced by limited-precision weights can be compensated by an increased number of layers. This paper shows that, from a theoretical point of view, neural networks with integer weights in the range [-p,p] can solve classification problems for which the minimum Euclidean distance between two patterns from opposite classes is 1/p. This result can be used in an information-theoretic context to calculate a bound on the number of bits necessary for solving a problem. It is shown that the number of bits is bounded by m*n*log(2pD), where m is the number of patterns, n is the dimensionality of the space, p is the weight range and D is the radius of a sphere including all patterns.
Keywords: neural networks, entropy, classification problems, integer weights, number of bits
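The bit bound in this abstract is a plug-in formula. The sketch below evaluates it with hypothetical parameter values (m, n, p, D are chosen only to exercise the formula) and reads the logarithm as base-2, the natural choice for counting bits.

```python
import math

# Evaluate the stated bound m * n * log(2 * p * D) on the number of bits,
# with log taken base-2. All parameter values are hypothetical.
m = 100    # number of patterns
n = 2      # dimensionality of the input space
p = 4      # integer weights lie in [-p, p]
D = 10.0   # radius of a sphere enclosing all patterns

bits_bound = m * n * math.log2(2 * p * D)
print(f"upper bound on bits: {bits_bound:.1f}")
```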

  • Book Chapter
  • 10.1007/3-540-64575-6_58
On the complexity of VLSI-friendly neural networks for classification problems
  • Jan 1, 1998
  • Sorin Draghici

This paper presents some complexity results for the specific case of a VLSI friendly neural network used in classification problems. A VLSI-friendly neural network is a neural network using exclusively integer weights in a narrow interval. The results presented here give updated worst-case lower bounds for the number of weights used by the network. It is shown that the number of weights can be lower bounded by an expression calculated using parameters depending exclusively on the problem (the minimum distance between patterns of opposite classes, the maximum distance between any patterns, the number of patterns and the number of dimensions). The theoretical approach is used to calculate the necessary weight range, a lower bound for the number of bits necessary to solve the problem in the worst case and the necessary number of weights for several problems. Then, a constructive algorithm using limited precision integer weights is used to construct and train neural networks for the same problems. The experimental values obtained are then compared with the theoretical values calculated. The comparison shows that the necessary weight precision can be estimated accurately using the given approach. However, the estimated numbers of weights are in general larger than the values obtained experimentally.

  • Conference Article
  • Cited by 4
  • 10.1109/ijcnn.1999.831551
Some new results on the capabilities of integer weights neural networks in classification problems
  • Jul 10, 1999
  • S Draghici

This paper analyzes some aspects of the computational power of neural networks (NN) using integer weights in a very restricted range. Using limited-range integer values opens the road for efficient VLSI implementations because: 1) a limited range for the weights can be translated into reduced storage requirements, and 2) integer computation can be implemented in a more efficient way than floating point. The paper shows that a neural network using integer weights in the range [-p,p] (where p is a small integer value) can classify correctly any set of patterns included in a hypercube of unit side length centered around the origin of R^n, n ≥ 2, for which the minimum Euclidean distance between two patterns of opposite classes is d_min ≥ √(n−1)/(2p).
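The separability condition d_min ≥ √(n−1)/(2p) is easy to check for a concrete pattern set. The sketch below does so for a hypothetical two-class set in the unit hypercube centered at the origin (the points themselves are made up; only the formula comes from the abstract).

```python
import numpy as np

# Check the sufficient condition for integer weights in [-p, p]:
# correct classification is guaranteed when d_min >= sqrt(n - 1) / (2 * p).
def min_opposite_class_distance(X0, X1):
    return min(np.linalg.norm(a - b) for a in X0 for b in X1)

n, p = 2, 3
X0 = np.array([[-0.3, -0.3], [-0.2, -0.4]])   # class-0 patterns (hypothetical)
X1 = np.array([[0.3, 0.3], [0.4, 0.2]])       # class-1 patterns (hypothetical)

d_min = min_opposite_class_distance(X0, X1)
threshold = np.sqrt(n - 1) / (2 * p)
print(d_min >= threshold)   # condition holds for this pattern set
```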

  • Research Article
  • Cited by 40
  • 10.1016/j.cor.2010.05.001
A hybrid radial basis function and data envelopment analysis neural network for classification
  • May 6, 2010
  • Computers & Operations Research
  • Parag C Pendharkar


  • Research Article
  • Cited by 149
  • 10.1109/tsmcb.2005.847740
An Empirical Comparison of Combinations of Evolutionary Algorithms and Neural Networks for Classification Problems
  • Oct 1, 2005
  • IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics)
  • E Cantu-Paz + 1 more

There are numerous combinations of neural networks (NNs) and evolutionary algorithms (EAs) used in classification problems. EAs have been used to train the networks, design their architecture, and select feature subsets. However, most of these combinations have been tested on only a few data sets and many comparisons are done inappropriately measuring the performance on training data or without using proper statistical tests to support the conclusions. This paper presents an empirical evaluation of eight combinations of EAs and NNs on 15 public-domain and artificial data sets. Our objective is to identify the methods that consistently produce accurate classifiers that generalize well. In most cases, the combinations of EAs and NNs perform equally well on the data sets we tried and were not more accurate than hand-designed neural networks trained with simple backpropagation.

  • Research Article
  • 10.15587/2706-5448.2022.252695
Comparative characteristics of the ability of convolutional neural networks to the concept of transfer learning
  • Feb 11, 2022
  • Technology audit and production reserves
  • Vladimir Khotsyanovsky

The object of this research is the ability to combine a previously trained deep feed-forward neural network model with user data in problems of determining the class of a single object in an image; that is, transfer learning in convolutional neural networks for classification problems. The research is based on comparing the theoretical and practical results obtained when training convolutional neural networks. The main objective is to conduct two different learning processes. In traditional training, the values of all weights of every layer of the network are adjusted in each training epoch, and the network is then trained on a sample of image data. The second process uses transfer-learning methods: when a pre-trained network is initialized, the weights of all its layers are frozen except for the last fully connected layer. That layer is replaced by a new one whose number of outputs equals the number of classes in the sample, its parameters are initialized with random values drawn from a normal distribution, and the convolutional neural network is then trained on the given sample. After training, the results were compared. In conclusion, training convolutional neural networks with transfer-learning techniques can be applied to a variety of classification tasks, ranging from digits to astronomical objects (stars and quasars). The amount of computing resources spent on such research is also quite important, because not every convolutional neural network model can be fully trained without powerful computer systems and a large number of images in the training sample.
