MixTrain: accelerating DNN training via input mixing.

Sarada Krithivasan, Sanchari Sen, Swagath Venkataramani, Anand Raghunathan

doi:10.3389/frai.2024.1387936

Abstract

Training Deep Neural Networks (DNNs) places immense compute requirements on the underlying hardware platforms, expending large amounts of time and energy. An important factor contributing to the long training times is the increasing dataset complexity required to reach state-of-the-art performance in real-world applications. To address this challenge, we explore the use of input mixing, where multiple inputs are combined into a single composite input with an associated composite label for training. The goal is for training on the mixed input to achieve a similar effect as training separately on each of the constituent inputs that it represents. This results in a lower number of inputs (or mini-batches) to be processed in each epoch, proportionally reducing training time. We find that naive input mixing leads to a considerable drop in learning performance and model accuracy due to interference between the forward/backward propagation of the mixed inputs. We propose two strategies to address this challenge and realize training speedups from input mixing with minimal impact on accuracy. First, we reduce the impact of inter-input interference by exploiting the spatial separation between the features of the constituent inputs in the network's intermediate representations. We also adaptively vary the mixing ratio of constituent inputs based on their loss in previous epochs. Second, we propose heuristics to automatically identify the subset of the training dataset that is subject to mixing in each epoch. Across ResNets of varying depth, MobileNetV2 and two Vision Transformer networks, we obtain up to 1.6× and 1.8× speedups in training for the ImageNet and CIFAR-10 datasets, respectively, on an NVIDIA RTX 2080Ti GPU, with negligible loss in classification accuracy.
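The basic idea of input mixing can be illustrated with a minimal sketch. The abstract does not specify the mixing operator, so the weighted average used here is an assumption, and the names `mix_pair`, `mix_dataset`, and `ratio` are hypothetical. The sketch only shows how pairing inputs halves the number of samples processed per epoch; it does not implement the interference-reduction or subset-selection strategies described above.

```python
def mix_pair(x_a, x_b, y_a, y_b, ratio=0.5):
    """Combine two flattened inputs and their one-hot labels into one composite.

    `ratio` is the weight given to the first input (assumed weighted-average mixing).
    """
    mixed_x = [ratio * a + (1.0 - ratio) * b for a, b in zip(x_a, x_b)]
    mixed_y = [ratio * a + (1.0 - ratio) * b for a, b in zip(y_a, y_b)]
    return mixed_x, mixed_y


def mix_dataset(inputs, labels, ratio=0.5):
    """Mix consecutive pairs, so n inputs become n // 2 composite inputs."""
    mixed_inputs, mixed_labels = [], []
    for i in range(0, len(inputs) - 1, 2):
        x, y = mix_pair(inputs[i], inputs[i + 1], labels[i], labels[i + 1], ratio)
        mixed_inputs.append(x)
        mixed_labels.append(y)
    return mixed_inputs, mixed_labels


# Toy data: four 3-element "images" with one-hot labels over two classes.
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
labels = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]

mixed_inputs, mixed_labels = mix_dataset(inputs, labels, ratio=0.5)
print(len(inputs), "inputs ->", len(mixed_inputs), "composite inputs")
```

Each composite label remains a valid distribution over classes because the two weights sum to one, which is what allows a single forward and backward pass to stand in for two.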

Journal: Frontiers in Artificial Intelligence
Publication Date: Sep 4, 2024
License type: CC BY 4.0

Similar Papers

A Framework for Distributed Deep Neural Network Training with Heterogeneous Computing Platforms
Bontak Gu ... Young Geun Kim
01 Dec 2019

Neuroevolution in Deep Neural Networks: Current Trends and Future Challenges
Edgar Galvan ... Peter Mooney
IEEE Transactions on Artificial Intelligence | VOL. 2
04 May 2021

Accelerating DNN Training Through Selective Localized Learning.
Sarada Krithivasan ... Anand Raghunathan
Frontiers in Neuroscience | VOL. 15
11 Jan 2022

DLB: A Dynamic Load Balance Strategy for Distributed Training of Deep Neural Networks
Qing Ye ... Jiancheng Lv
IEEE Transactions on Emerging Topics in Computational Intelligence | VOL. 7
01 Aug 2023
