Abstract

Deep learning is an important component of Big Data analytics tools and intelligent applications, such as self-driving cars, computer vision, speech recognition, and precision medicine. However, the training process is computationally intensive and often requires a large amount of time if performed sequentially. Modern parallel computing systems provide the capability to reduce the required training time of deep neural networks. In this paper, we present our parallelization scheme for training convolutional neural networks (CNNs), named Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS). Major features of CHAOS include support for thread and vector parallelism, non-instant updates of weight parameters during back-propagation without significant delay, and implicit synchronization in arbitrary order. CHAOS is tailored for parallel computing systems that are accelerated with the Intel Xeon Phi. We evaluate our parallelization approach empirically, using measurement techniques and performance modeling for various numbers of threads and CNN architectures. Experimental results for the MNIST dataset of handwritten digits, using the full number of available threads on the Xeon Phi, show speedups of up to 103× compared to execution on a single thread of the Xeon Phi, 14× compared to sequential execution on an Intel Xeon E5, and 58× compared to sequential execution on an Intel Core i5.
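
To make the scheme concrete, the following C++/OpenMP fragment is a minimal sketch in the spirit of CHAOS, not the authors' implementation: worker threads update one shared weight vector without locks (Hogwild-style thread parallelism), and the per-example inner loops are vectorized with #pragma omp simd (vector parallelism). A toy linear least-squares model stands in for the CNN; the dataset, learning rate, and all names are illustrative assumptions.

    // Minimal Hogwild-style parallel SGD sketch (compile: g++ -fopenmp).
    // A linear least-squares model stands in for the CNN.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Example {
        std::vector<float> x;  // input features
        float y;               // target value
    };

    int main() {
        const std::size_t dim = 64, n = 10000;
        const float lr = 0.01f;

        // Toy dataset: the target is the sum of the features.
        std::vector<Example> data(n);
        for (std::size_t i = 0; i < n; ++i) {
            data[i].x.assign(dim, float(i % 7) / 7.0f);
            data[i].y = 0.0f;
            for (float v : data[i].x) data[i].y += v;
        }

        // Shared weights: all threads update them without locks.
        std::vector<float> w(dim, 0.0f);

        #pragma omp parallel for schedule(dynamic)
        for (long i = 0; i < (long)n; ++i) {
            const Example& e = data[i];

            // "Forward pass": prediction against the shared weights.
            float pred = 0.0f;
            #pragma omp simd reduction(+ : pred)
            for (std::size_t j = 0; j < dim; ++j) pred += w[j] * e.x[j];
            const float err = pred - e.y;

            // "Backward pass": lock-free, vectorized weight update.
            // Writes from different threads may interleave; Hogwild-style
            // schemes tolerate these races instead of serializing updates.
            #pragma omp simd
            for (std::size_t j = 0; j < dim; ++j) w[j] -= lr * err * e.x[j];
        }

        std::printf("w[0] = %f\n", w[0]);
        return 0;
    }

In the actual scheme, updates are applied per layer during back-propagation and synchronization happens implicitly in arbitrary order; the sketch only conveys the lock-free, vectorized update pattern.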

Highlights

  • Traditionally, engineers developed applications by explicitly specifying the computer instructions that determined the application behavior; deep learning models instead learn to solve complex problems from data and experience

  • We introduce Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS), a parallelization scheme that exploits both thread- and SIMD-level parallelism available on the Intel Xeon Phi

  • Deep learning is important for many modern applications, such as voice recognition, face recognition, autonomous cars, precision medicine, and computer vision


Introduction

Traditionally, engineers developed applications by specifying computer instructions that determined the application behavior. Nowadays engineers focus on developing and implementing sophisticated deep learning models that can learn to solve complex problems. Deep learning algorithms [28] can learn from their own experience rather than from that of the engineer. Such algorithms build hierarchical representations in which higher-level abstractions are composed of lower-level ones. For images, for instance, edges and corners are lower-level abstractions that are combined into more complex spatial patterns at higher levels. It is known that the animal cortex consists of both simple and complex cells that fire in response to certain visual inputs in their receptive fields. Simple cells detect edge-like patterns, whereas complex cells are locally invariant and span larger receptive fields. These fundamental properties of the animal brain inspired the design of DNNs and CNNs.

