Abstract

Neural networks are powerful models for solving many real-world problems, but training them is computationally intensive because of both large model sizes and large training datasets. This paper studies a parallel training algorithm, evaluated on the classical convolutional neural network LeNet-5, that aims to reduce training time by exploiting the massively parallel architecture and high computing power of GPUs. While much prior work on improving parallel performance concentrates on the design of the parallel structures themselves, this work instead enhances parallel training through appropriate settings of the neural network's hyperparameters. Specifically, the paper (1) measures how running time scales with the number of GPUs and (2) investigates the relationship between the degree of parallelism and model accuracy degradation, experimentally deriving a parallel scheme that reduces training time while preserving accuracy.
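The abstract does not specify a framework, so the following is only a minimal illustrative sketch of the kind of single-node multi-GPU data-parallel training of LeNet-5 it describes; the PyTorch library, the `LeNet5` class, the stand-in random data, and all hyperparameter values are assumptions, not the authors' implementation.

```python
# Minimal sketch of multi-GPU data-parallel training for LeNet-5.
# Assumptions (not from the paper): PyTorch, random stand-in data in
# place of the real training set, and illustrative hyperparameters.
import torch
import torch.nn as nn


class LeNet5(nn.Module):
    """Classical LeNet-5: two conv/pool stages, then three FC layers."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),
            nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(), nn.AvgPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def main():
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model = LeNet5().to(device)
    # Replicate the model on every visible GPU; each replica processes a
    # slice of the global batch, and gradients are accumulated back onto
    # the parameters of the original module.
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    # Stand-in batch of 32x32 grayscale images (MNIST padded to LeNet-5's
    # input size); the paper's experiments would use the real dataset.
    for step in range(10):
        images = torch.randn(256, 1, 32, 32, device=device)
        labels = torch.randint(0, 10, (256,), device=device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        print(f"step {step}: loss {loss.item():.4f}")


if __name__ == "__main__":
    main()
```

Note that in this data-parallel scheme the global batch is split across the devices, so the effective per-GPU batch size is the global size divided by the GPU count; this is one way the degree of parallelism interacts with the hyperparameter settings the abstract alludes to.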
