Training Process Of Deep Neural Networks Research Articles

Deep neural networks (DNNs) have achieved unprecedented success across many scientific and engineering fields in recent decades. Despite their empirical success, recent studies have shown that DNN models have various failure modes and blind spots that may result in unexpected serious failures and potential harms, e.g. vulnerability to adversarial examples and small perturbations. This is unacceptable, especially for safety-critical and high-stakes real-world applications, including healthcare, self-driving cars, aircraft control systems, hiring, and malware detection protocols. Moreover, it has been challenging to understand why and when DNNs will fail, owing to their complicated structures and black-box behaviors. Lack of interpretability is a critical issue that may seriously hinder the deployment of DNNs in high-stakes applications, which need interpretability to trust predictions, to understand potential failures, and to mitigate harms and eliminate biases in the model. To make DNNs trustworthy and reliable for deployment, it is necessary and urgent to develop methods and tools that can (i) quantify and improve their robustness against adversarial and natural perturbations, and (ii) understand their underlying behaviors and further correct errors to prevent injuries and damages. These are important first steps toward enabling Trustworthy AI and Trustworthy Machine Learning. In this talk, I will survey a series of research efforts in my lab that contribute to tackling the grand challenges in (i) and (ii). In the first part of my talk, I will give an overview of our research in Robust Machine Learning since 2017, where we have proposed the first attack-agnostic robustness evaluation metric, the first efficient robustness certification algorithms for various types of perturbations, and efficient robust learning algorithms spanning supervised learning to deep reinforcement learning.
In the second part of my talk, I will survey a series of exciting results from my lab on accelerating interpretable machine learning and explainable AI. Specifically, I will show how we can bring interpretability into deep learning by leveraging recent advances in multi-modal models. I will present recent work in our group on automatically dissecting neural networks with open-vocabulary concepts and designing interpretable neural networks without concept labels, and briefly review our recent efforts on demystifying the black-box DNN training process, automated neuron explanations for Large Language Models, and the first robustness evaluation of a family of neuron-level interpretation techniques.

This paper presents a comprehensive empirical investigation into the interactions between various randomization techniques in Deep Neural Networks (DNNs) and their impact on learning performance. It is well established that injecting randomness into the training process of DNNs, through various approaches and at different stages, is often beneficial for reducing overfitting and improving generalization. Nonetheless, the interactions between randomness techniques such as weight noise, dropout, and many others remain poorly understood. Consequently, it is challenging to determine which methods can be effectively combined to optimize DNN performance. To address this issue, we categorize the existing randomness techniques into four key types: injection of noise/randomness at the data, model structure, optimization, or learning stage. We use this classification to identify gaps in the current coverage of potential mechanisms for introducing randomness, which leads us to propose two new techniques: adding noise to the loss function and random masking of the gradient updates. In our empirical study, we employ a Particle Swarm Optimizer (PSO) for hyperparameter optimization (HPO) to explore the space of possible configurations and determine where and how much randomness should be injected to maximize DNN performance. We assess the impact of various types and levels of randomness for DNN architectures across standard computer vision benchmarks: MNIST, FASHION-MNIST, CIFAR10, and CIFAR100. Across more than 30,000 evaluated configurations, we perform a detailed examination of the interactions between randomness techniques and their combined impact on DNN performance. Our findings reveal that randomness through data augmentation and in weight initialization are the main contributors to performance improvement. Additionally, correlation analysis demonstrates that different optimizers, such as Adam and Gradient Descent with Momentum, prefer distinct types of randomization during the training process. A GitHub repository with the complete implementation and generated dataset is available.
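The two techniques the abstract proposes (noise on the loss function and random masking of gradient updates) can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: loss noise is modelled as a multiplicative factor (1 + eps) with Gaussian eps, gradient masking zeroes each gradient component independently with probability 1 - keep_prob, and the names mask_gradient, noisy_loss_scale, train, keep_prob, and sigma are hypothetical. The model is a one-dimensional linear regression rather than a DNN, to keep the example self-contained.

```python
import random


def mask_gradient(grad, keep_prob, rng):
    """Assumed masking rule: keep each component with probability keep_prob, else zero it."""
    return [g if rng.random() < keep_prob else 0.0 for g in grad]


def noisy_loss_scale(sigma, rng):
    """Assumed loss noise: a factor (1 + eps), eps ~ N(0, sigma^2).

    Multiplying the loss by a constant multiplies its gradient by the same constant,
    so the factor is applied directly to the gradient in train().
    """
    return 1.0 + rng.gauss(0.0, sigma)


def train(xs, ys, steps=2000, lr=0.05, keep_prob=0.7, sigma=0.1, seed=0):
    """Gradient descent on mean squared error for y_hat = w * x + b."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        # Exact MSE gradient with respect to (w, b).
        grad = [2.0 * sum(e * x for e, x in zip(errs, xs)) / n,
                2.0 * sum(errs) / n]
        # Randomness at the loss: rescale the gradient by the noisy loss factor.
        scale = noisy_loss_scale(sigma, rng)
        grad = [scale * g for g in grad]
        # Randomness at the update: drop a random subset of gradient components.
        grad = mask_gradient(grad, keep_prob, rng)
        w -= lr * grad[0]
        b -= lr * grad[1]
    return w, b


# Noise-free data generated from y = 2x + 1.
xs = [i / 10.0 for i in range(-10, 11)]
ys = [2.0 * x + 1.0 for x in xs]
w, b = train(xs, ys)
print(w, b)
```

On this noise-free toy problem the fitted values should land close to the generating slope 2 and intercept 1, because both sources of randomness vanish as the gradient approaches zero. In a study like the one described, keep_prob and sigma would be among the hyperparameters a search procedure explores.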

Articles published on Training Process Of Deep Neural Networks

Invisible backdoor attack with attention and steganography

Towards Trustworthy Deep Learning

Rolling the dice for better deep learning performance: A study of randomness techniques in deep neural networks

Get Your Foes Fooled: Proximal Gradient Split Learning for Defense Against Model Inversion Attacks on IoMT Data

Attack Classification of Imbalanced Intrusion Data for IoT Network Using Ensemble-Learning-Based Deep Neural Network

LHDNN: Maintaining High Precision and Low Latency Inference of Deep Neural Networks on Encrypted Data

EdgeMesh: A hybrid distributed training mechanism for heterogeneous edge devices

A Non-Idealities Aware Software–Hardware Co-Design Framework for Edge-AI Deep Neural Network Implemented on Memristive Crossbar

A novel adaptive fault diagnosis algorithm for multi-machine equipment: application in bearing and diesel engine

BATUDE: Budget-Aware Neural Network Compression Based on Tucker Decomposition

Backdoor Attacks on Image Classification Models in Deep Neural Networks

An Information Theoretic Interpretation to Deep Neural Networks.

Visual vs internal attention mechanisms in deep neural networks for image classification and object detection

Learning to optimize for resource allocation in LTE-U networks

Theory of the Frequency Principle for General Deep Neural Networks

A Unified Framework for Cross-Domain and Cross-System Recommendations

A rolling bearing fault diagnosis method based on fastDTW and an AGBDBN

A survey of swarm and evolutionary computing approaches for deep learning

Accelerating Deep Learning with a Parallel Mechanism Using CPU + MIC

Wind Turbine Gearbox Failure Identification With Deep Neural Networks
