Abstract

Neural networks based on nanodevices, such as metal-oxide memristors, phase-change memories, and flash memory cells, have generated considerable interest for their increased energy efficiency and density in comparison to graphics processing units (GPUs) and central processing units (CPUs). Although immense acceleration of the training process can be achieved by leveraging the fact that its time complexity does not scale with network size, this advantage is limited by the space complexity of stochastic gradient descent, which grows quadratically with network size. The main objective of this work is to reduce this space complexity by using low-rank approximations of stochastic gradient descent. This low space complexity, combined with streaming methods, allows for significant reductions in memory and compute overhead, opening the door to improvements in the area, time, and energy efficiency of training. We refer to this algorithm, and the architecture that implements it, as the streaming batch eigenupdate (SBE) approach.
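
As a rough illustration of the idea, the sketch below (plain NumPy; the function name and update rule are our own illustrative choices, not necessarily the paper's exact algorithm) estimates the top singular pair of an accumulated minibatch gradient G = Σ_k δ_k x_kᵀ one sample at a time, so that only two vectors of O(n) memory, rather than the O(n²) full gradient matrix, need to be stored before a single rank-1 crossbar update is applied.

```python
import numpy as np

def streaming_rank1_update(samples, n_out, n_in, rng=None):
    """Estimate the top singular pair (u, s, v) of the accumulated
    minibatch gradient G = sum_k outer(delta_k, x_k) without ever
    materializing the O(n_out * n_in) matrix G.

    `samples` yields (delta_k, x_k) pairs: the per-example error vector
    and input vector whose outer product is the SGD weight gradient.
    This is a generic streaming power-iteration scheme, offered as a
    sketch of the low-rank idea rather than the SBE paper's exact rule.
    """
    rng = rng or np.random.default_rng(0)
    u = rng.standard_normal(n_out)          # left singular vector estimate
    v = rng.standard_normal(n_in)           # right singular vector estimate
    u /= np.linalg.norm(u)
    v /= np.linalg.norm(v)
    s = 0.0
    for delta, x in samples:
        # Because G is a sum of outer products, G @ v and G.T @ u can be
        # accumulated one sample at a time using only dot products:
        # (delta x^T) v = delta * (x . v)  and  (x delta^T) u = x * (delta . u)
        u_new = u + delta * (x @ v)
        v_new = v + x * (delta @ u)
        u = u_new / np.linalg.norm(u_new)
        v = v_new / np.linalg.norm(v_new)
        s += (delta @ u) * (x @ v)          # crude running magnitude estimate
    return u, s, v

# The returned triple can then be applied as one rank-1 crossbar update:
#     W -= learning_rate * s * np.outer(u, v)
```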

Highlights

  • Deep neural networks (DNNs) have grown increasingly popular over the years in a wide range of fields, from image recognition to natural language processing

  • We focus on backpropagation-based learning in a single layer of a deep neural network, where that layer's weights are stored in a memristor crossbar array (see the sketch after this list)

  • The streaming batch eigenupdate (SBE) approach is lower performing than minibatch gradient descent (MBGD) in terms of the number of epochs needed to train and the number of matrix updates
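
To make the crossbar-learning highlight concrete, here is a minimal sketch of one backpropagation step for such a layer (hypothetical sizes and variable names; the crossbar is idealized as a plain weight matrix). The key point is that the per-example weight gradient is the outer product δxᵀ, exactly the rank-1 structure that eigenupdate methods exploit.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_out = 8, 4                            # hypothetical layer sizes
W = 0.1 * rng.standard_normal((n_out, n_in))  # stands in for crossbar conductances

x = rng.standard_normal(n_in)                 # layer input
y = W @ x                                     # forward pass: one analog crossbar read

grad_y = rng.standard_normal(n_out)           # dL/dy, as delivered by backprop

# The per-example weight gradient is a rank-1 outer product, which a
# crossbar can be programmed with via simultaneous row/column pulses:
grad_W = np.outer(grad_y, x)                  # shape (n_out, n_in)
W -= 0.01 * grad_W                            # one outer-product update
```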

Introduction

Deep neural networks (DNNs) have grown increasingly popular over the years in a wide range of fields, from image recognition to natural language processing. These systems carry enormous computational overhead, much of it incurred by multiply-and-accumulate (MAC) operations, and specialized hardware has been developed to accelerate these tasks. Investigations into nanodevices suitable for analog inference have focused on several families of two-terminal memory devices (memristors, resistive random-access memory (ReRAM), phase-change memory (PCM), etc.) as well as three-terminal devices (flash memory, lithium insertion) (Haensch et al., 2019). These devices have the desirable properties of analog tunability, high endurance, and long-term memory needed for use in embedded inference applications. Applications based on these devices perform well when used for inference and have been well studied, with intermediate-scale systems having been built by integrating devices into crossbar arrays (Prezioso et al., 2015; Adam et al., 2017; Chakrabarti et al., 2017; Wang et al., 2018).
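
For readers unfamiliar with the hardware, the following sketch models the analog MAC an idealized crossbar performs in a single read, using the common differential-pair convention for signed weights (all names and values here are illustrative; real devices add nonlinearity and noise).

```python
import numpy as np

rng = np.random.default_rng(2)
n_rows, n_cols = 4, 6

# Signed weights are often encoded as differential conductance pairs,
# w_ij ~ (G+_ij - G-_ij); values here are arbitrary illustrative units.
G_plus = rng.uniform(0.0, 1.0, (n_rows, n_cols))
G_minus = rng.uniform(0.0, 1.0, (n_rows, n_cols))

v_in = rng.standard_normal(n_cols)   # input voltages applied to the columns

# Ohm's law plus Kirchhoff's current law make each row current a dot
# product, so the full matrix-vector MAC happens in one analog read:
i_out = (G_plus - G_minus) @ v_in    # shape (n_rows,)
```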
