Principled deep neural network training through linear programming

Daniel Bienstock, Gonzalo Muñoz, Sebastian Pokutta

doi:10.1016/j.disopt.2023.100795

Abstract

Deep learning has received much attention lately due to the impressive empirical performance achieved by training algorithms. Consequently, a need for a better theoretical understanding of these problems has become more evident and multiple works in recent years have focused on this task. In this work, using a unified framework, we show that there exists a polyhedron that simultaneously encodes, in its facial structure, all possible deep neural network training problems that can arise from a given architecture, activation functions, loss function, and sample size. Notably, the size of the polyhedral representation depends only linearly on the sample size, and a better dependency on several other network parameters is unlikely. Using this general result, we compute the size of the polyhedral encoding for commonly used neural network architectures. Our results provide a new perspective on training problems through the lens of polyhedral theory and reveal strong structure arising from these problems.


Journal: Discrete Optimization
Publication Date: Aug 1, 2023
Citations: 6

