High-dimensional Input Research Articles

Despite many proposed algorithms to provide robustness to deep learning (DL) models, DL models remain susceptible to adversarial attacks. We hypothesize that the adversarial vulnerability of DL models stems from two factors. The first factor is data sparsity which is that in the high dimensional input data space, there exist large regions outside the support of the data distribution. The second factor is the existence of many redundant parameters in the DL models. Owing to these factors, different models are able to come up with different decision boundaries with comparably high prediction accuracy. The appearance of the decision boundaries in the space outside the support of the data distribution does not affect the prediction accuracy of the model. However, it makes an important difference in the adversarial robustness of the model. We hypothesize that the ideal decision boundary is as far as possible from the support of the data distribution. In this paper, we develop a training framework to observe if DL models are able to learn such a decision boundary spanning the space around the class distributions further from the data points themselves. Semi-supervised learning was deployed during training by leveraging unlabeled data generated in the space outside the support of the data distribution. We measured adversarial robustness of the models trained using this training framework against well-known adversarial attacks and by using robustness metrics. We found that models trained using our framework, as well as other regularization methods and adversarial training support our hypothesis of data sparsity and that models trained with these methods learn to have decision boundaries more similar to the aforementioned ideal decision boundary. We show that the unlabeled data generated by noise in our framework is almost as effective on adversarial robustness as unlabeled data sourced from existing datasets or generated by synthesis algorithms. The code for our training framework is available online.

Long short-term memory (LSTM) is a type of powerful deep neural network that has been widely used in many sequence analysis and modeling applications. However, the large model size problem of LSTM networks make their practical deployment still very challenging, especially for the video recognition tasks that require high-dimensional input data. Aiming to overcome this limitation and fully unlock the potentials of LSTM models, in this paper we propose to perform algorithm and hardware co-design towards high-performance energy-efficient LSTM networks. At algorithm level, we propose to develop <italic xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">fully decomposed hierarchical Tucker (FDHT)</i> structure-based LSTM, namely FDHT-LSTM, which enjoys ultra-low model complexity while still achieving high accuracy. In order to fully reap such attractive algorithmic benefit, we further develop the corresponding customized hardware architecture to support the efficient execution of the proposed FDHT-LSTM model. With the delicate design of memory access scheme, the complicated matrix transformation can be efficiently supported by the underlying hardware without any access conflict in an on-the-fly way. Our evaluation results show that both the proposed ultra-compact FDHT-LSTM models and the corresponding hardware accelerator achieve very high performance. Compared with the state-of-the-art compressed LSTM models, FDHT-LSTM enjoys both order-of-magnitude reduction (more than <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$1000 \times$</tex-math></inline-formula> ) in model size and significant accuracy improvement (0.6% to 12.7%) across different video recognition datasets. Meanwhile, compared with the state-of-the-art tensor decomposed model-oriented hardware TIE, our proposed FDHT-LSTM architecture achieve <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$2.5\times$</tex-math></inline-formula> , <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$1.46\times$</tex-math></inline-formula> and <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$2.41\times$</tex-math></inline-formula> increase in throughput, area efficiency and energy efficiency, respectively on LSTM-Youtube workload. For LSTM-UCF workload, our proposed design also outperforms TIE with <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$1.9\times$</tex-math></inline-formula> higher throughput, <inline-formula xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink"><tex-math notation="LaTeX">$1.83\times$</tex-math></inline-formula> higher energy efficiency and comparable area efficiency.

High-dimensional Input Research Articles

Related Topics

Articles published on High-dimensional Input

RockGPT: reconstructing three-dimensional digital rocks from single two-dimensional slice with deep learning

Data-driven models for train control dynamics in high-speed railways: LAG-LSTM for train trajectory prediction

On High-Dimensional Time-Variant Reliability Analysis with the Maximum Entropy Principle

Enhanced SCADA IDS Security by Using MSOM Hybrid Unsupervised Algorithm

Classification of steel using laser-induced breakdown spectroscopy combined with deep belief network.

Structural damage identification method based on vibration statistical indicators and support vector machine

PQROM: To optimize software defined network QoS-aware routing with proximal policy optimization

ENHANCED SCADA IDS SECURITY BY USING MSOM HYBRID UNSUPERVISED ALGORITHM

Varying Coefficient Models and Design Choice for Bayes Linear Emulation of Complex Computer Models with Limited Model Evaluations

Accurate Remaining Useful Life Prediction With Uncertainty Quantification: A Deep Learning and Nonstationary Gaussian Process Approach

Sparse-Coding-Based Autoencoder and Its Application for Cancer Survivability Prediction

A weight initialization method based on neural network with asymmetric activation function

Explaining adversarial vulnerability with a data sparsity hypothesis

Using BART to Perform Pareto Optimization and Quantify its Uncertainties

Machine learning classification of trajectories from molecular dynamics simulations of chromosome segregation.

Research of Text Classification Based on TF-IDF and CNN-LSTM

Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition

Graph Embedding via High Dimensional Model Representation for Hyperspectral Images

Physics-informed Neural Networks-based Model Predictive Control for Multi-link Manipulators

A FULLY BAYESIAN GRADIENT-FREE SUPERVISED DIMENSION REDUCTION METHOD USING GAUSSIAN PROCESSES

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

High-dimensional Input Research Articles

Related Topics

Articles published on High-dimensional Input

RockGPT: reconstructing three-dimensional digital rocks from single two-dimensional slice with deep learning

Data-driven models for train control dynamics in high-speed railways: LAG-LSTM for train trajectory prediction

On High-Dimensional Time-Variant Reliability Analysis with the Maximum Entropy Principle

Enhanced SCADA IDS Security by Using MSOM Hybrid Unsupervised Algorithm

Classification of steel using laser-induced breakdown spectroscopy combined with deep belief network.

Structural damage identification method based on vibration statistical indicators and support vector machine

PQROM: To optimize software defined network QoS-aware routing with proximal policy optimization

ENHANCED SCADA IDS SECURITY BY USING MSOM HYBRID UNSUPERVISED ALGORITHM

Varying Coefficient Models and Design Choice for Bayes Linear Emulation of Complex Computer Models with Limited Model Evaluations

Accurate Remaining Useful Life Prediction With Uncertainty Quantification: A Deep Learning and Nonstationary Gaussian Process Approach

Sparse-Coding-Based Autoencoder and Its Application for Cancer Survivability Prediction

A weight initialization method based on neural network with asymmetric activation function

Explaining adversarial vulnerability with a data sparsity hypothesis

Using BART to Perform Pareto Optimization and Quantify its Uncertainties

Machine learning classification of trajectories from molecular dynamics simulations of chromosome segregation.

Research of Text Classification Based on TF-IDF and CNN-LSTM

Algorithm and Hardware Co-Design of Energy-Efficient LSTM Networks for Video Recognition with Hierarchical Tucker Tensor Decomposition

Graph Embedding via High Dimensional Model Representation for Hyperspectral Images

Physics-informed Neural Networks-based Model Predictive Control for Multi-link Manipulators

A FULLY BAYESIAN GRADIENT-FREE SUPERVISED DIMENSION REDUCTION METHOD USING GAUSSIAN PROCESSES