Abstract

This study presents an efficient and rapid implementation of a Stochastic Computing (SC) based Deep Neural Network (DNN) on a low-cost hardware platform. The proposed technique uses bipolar signal encoding in stochastic computing, which gives a relatively low hardware footprint compared to binary computing. A stochastic max function is then presented and subsequently used to approximate the hyperbolic tangent activation function in SC. In addition, saturation arithmetic is proposed to reduce the down-scaling parameters that can otherwise degrade computational precision. We demonstrate the feasibility of our SC-based DNN through a hardware accelerator prototype with an AXI Stream interface on a PYNQ Z2 board, which is equipped with a Xilinx Zynq XC7Z020-1CLG400C. The validity of this study is demonstrated through an MNIST handwritten digit recognition task. The experimental results show that our SC-based DNN model can be easily deployed on embedded devices. The SC-based accelerator with the AXI Stream interface achieves a processing throughput of 1.877 GOP/s and achieves higher accuracy with minimum area and energy consumption, consuming only 0.61 mm² of area and 1.89 W of power.
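To make the bipolar encoding mentioned in the abstract concrete, the following is a minimal software sketch, not the authors' hardware design. It assumes the standard SC convention that a value x in [-1, 1] is represented by a bitstream with P(bit = 1) = (x + 1) / 2, under which multiplication reduces to a bitwise XNOR of two independent streams. The function names and the stream length are illustrative choices.

```python
import random

def encode_bipolar(x, length, rng):
    """Encode x in [-1, 1] as a bitstream with P(bit = 1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode_bipolar(bits):
    """Recover the encoded value: x = 2 * P(1) - 1."""
    return 2.0 * sum(bits) / len(bits) - 1.0

def multiply_bipolar(a_bits, b_bits):
    """Bipolar multiplication: bitwise XNOR of two independent streams."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 1 << 14                     # stream length (illustrative assumption)
a = encode_bipolar(0.5, n, rng)
b = encode_bipolar(-0.6, n, rng)
product = decode_bipolar(multiply_bipolar(a, b))
print(product)                  # near 0.5 * -0.6 = -0.3, up to sampling noise
```

A single XNOR gate replaces a full binary multiplier, which is the source of the small hardware footprint; the cost is that precision depends on the stream length.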

Highlights

  • Humans have always dreamt of creating intelligent machines that can think

  • This study considers stochastic computing, a low-cost alternative to conventional binary computing, for implementing modern deep neural networks

  • It was found that the worst-case scaling parameters inherently introduced by stochastic arithmetic tend to be overly pessimistic, undermining the implementation of neural network inference in Stochastic Computing (SC)
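As an illustration of where a worst-case scaling parameter comes from, the sketch below models a conventional multiplexer-based scaled adder in software. This is an assumption for illustration only: the excerpt does not describe the paper's own adder or its saturation scheme. With N inputs the multiplexer output encodes the sum divided by N, a factor sized for the case where every input sits at +1 or -1, which typical inputs rarely reach. All names and values are hypothetical.

```python
import random

def encode_bipolar(x, length, rng):
    """Encode x in [-1, 1] as a bitstream with P(bit = 1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]

def decode_bipolar(bits):
    """Recover the encoded value: x = 2 * P(1) - 1."""
    return 2.0 * sum(bits) / len(bits) - 1.0

def mux_add(streams, rng):
    """Scaled addition: at each clock, pass one randomly selected input bit.

    The output stream encodes sum(inputs) / N, so the result always
    stays inside [-1, 1], even in the worst case.
    """
    length = len(streams[0])
    return [streams[rng.randrange(len(streams))][t] for t in range(length)]

rng = random.Random(1)              # fixed seed so the run is repeatable
n = 1 << 14                         # stream length (illustrative assumption)
values = [0.8, -0.4, 0.2, 0.6]      # hypothetical inputs, true sum = 1.2
streams = [encode_bipolar(v, n, rng) for v in values]
scaled_sum = decode_bipolar(mux_add(streams, rng))
print(scaled_sum)                   # near 1.2 / 4 = 0.3, up to sampling noise
```

Here the true sum is 1.2 but the adder reports roughly 0.3, so the result occupies a small part of the representable range. Stacking such adders across network layers compounds the down-scaling, which is the pessimism the highlight refers to.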


Summary

Introduction

Humans have always dreamt of creating intelligent machines that can think. Today, Artificial Intelligence (AI) is a thriving field with many active research topics and practical applications. Deep Neural Networks (DNNs) have achieved unprecedented success in many machine learning applications such as speech recognition (Abdel-Hamid et al., 2014) and visual object recognition (Simonyan and Zisserman, 2014). Although such tasks are solved intuitively by humans, they originally proved to be the true challenge to artificial intelligence. The high-performance computing clusters on which such networks typically run incur high power consumption and a large hardware cost, thereby limiting their suitability for low-cost applications such as embedded and wearable IoT devices, which require low power consumption and a small hardware footprint (Ren et al., 2017). These applications increasingly utilise machine learning algorithms to perform fundamental tasks such as natural language processing, speech-to-text transcription, and image and video recognition (LeCun et al., 2015).

