Abstract

Deep learning (DL) has proven to be one of the most pivotal components of machine learning given its notable performance in a variety of application domains. Neural networks (NNs) for DL are tailored to specific application domains by varying in their topology and activation nodes. Nevertheless, the major operation type (with the largest computational complexity) is commonly multiply‐accumulate operation irrespective of their topology. Recent trends in DL highlight the evolution of NNs such that they become deeper and larger, and thus their prohibitive computational complexity. To cope with the consequent prohibitive latency for computation, 1) general‐purpose hardware, e.g., central processing units and graphics processing units, has been redesigned, and 2) various DL accelerators have been newly introduced, e.g., neural processing units, and computing‐in‐memory units for deep NN‐based DL, and neuromorphic processors for spiking NN‐based DL. In this review, these accelerators and their pros and cons are overviewed with particular focus on their performance and memory bandwidth.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.