Abstract

Deep learning is an important component of Big Data analytics tools and intelligent applications, such as self-driving cars, computer vision, speech recognition, and precision medicine. However, the training process is computationally intensive and often requires a large amount of time if performed sequentially. Modern parallel computing systems provide the capability to reduce the required training time of deep neural networks. In this paper, we present our parallelization scheme for training convolutional neural networks (CNNs), named Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS). Major features of CHAOS include support for thread and vector parallelism, non-instant updates of weight parameters during back-propagation without significant delay, and implicit synchronization in arbitrary order. CHAOS is tailored for parallel computing systems that are accelerated with the Intel Xeon Phi. We evaluate our parallelization approach empirically, using measurement techniques and performance modeling for various numbers of threads and CNN architectures. Experimental results for the MNIST dataset of handwritten digits, using the full number of available threads on the Xeon Phi, show speedups of up to 103× compared to execution on a single thread of the Xeon Phi, 14× compared to sequential execution on an Intel Xeon E5, and 58× compared to sequential execution on an Intel Core i5.
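
To make the scheme concrete, the following C++/OpenMP fragment is a minimal sketch in the spirit of CHAOS, not the authors' implementation: worker threads update one shared weight vector without locks (Hogwild-style thread parallelism), and the per-example inner loops are vectorized with #pragma omp simd (vector parallelism). A toy linear least-squares model stands in for the CNN; the dataset, learning rate, and all names are illustrative assumptions.

    // Minimal Hogwild-style parallel SGD sketch (compile: g++ -fopenmp).
    // A linear least-squares model stands in for the CNN.
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Example {
        std::vector<float> x;  // input features
        float y;               // target value
    };

    int main() {
        const std::size_t dim = 64, n = 10000;
        const float lr = 0.01f;

        // Toy dataset: the target is the sum of the features.
        std::vector<Example> data(n);
        for (std::size_t i = 0; i < n; ++i) {
            data[i].x.assign(dim, float(i % 7) / 7.0f);
            data[i].y = 0.0f;
            for (float v : data[i].x) data[i].y += v;
        }

        // Shared weights: all threads update them without locks.
        std::vector<float> w(dim, 0.0f);

        #pragma omp parallel for schedule(dynamic)
        for (long i = 0; i < (long)n; ++i) {
            const Example& e = data[i];

            // "Forward pass": prediction against the shared weights.
            float pred = 0.0f;
            #pragma omp simd reduction(+ : pred)
            for (std::size_t j = 0; j < dim; ++j) pred += w[j] * e.x[j];
            const float err = pred - e.y;

            // "Backward pass": lock-free, vectorized weight update.
            // Writes from different threads may interleave; Hogwild-style
            // schemes tolerate these races instead of serializing updates.
            #pragma omp simd
            for (std::size_t j = 0; j < dim; ++j) w[j] -= lr * err * e.x[j];
        }

        std::printf("w[0] = %f\n", w[0]);
        return 0;
    }

In the actual scheme, updates are applied per layer during back-propagation and synchronization happens implicitly in arbitrary order; the sketch only conveys the lock-free, vectorized update pattern.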

Highlights

  • Traditionally, engineers developed applications by explicitly specifying the computer instructions that determined the application behavior; deep learning models instead learn to solve complex problems from data and experience

  • We introduce Controlled Hogwild with Arbitrary Order of Synchronization (CHAOS), a parallelization scheme that exploits both thread- and SIMD-level parallelism available on the Intel Xeon Phi

  • Deep learning is important for many modern applications, such as voice recognition, face recognition, autonomous cars, precision medicine, and computer vision


Introduction

Traditionally, engineers developed applications by specifying computer instructions that determined the application behavior. Nowadays engineers focus on developing and implementing sophisticated deep learning models that can learn to solve complex problems. Deep learning algorithms [28] can learn from their own experience rather than from that of the engineer. Such algorithms build hierarchical representations in which higher-level abstractions are composed of lower-level ones. For images, for instance, edges and corners are lower-level abstractions that are combined into more complex spatial patterns at higher levels. It is known that the animal cortex consists of both simple and complex cells that fire in response to certain visual inputs in their receptive fields. Simple cells detect edge-like patterns, whereas complex cells are locally invariant and span larger receptive fields. These fundamental properties of the animal brain inspired the design of DNNs and CNNs.

