Multi-ANN embedded system based on a custom 3D-DRAM

Lee B Baker, Paul Franzon

doi:10.1109/s3s.2018.8640177

Abstract

Machine learning in the form of Artificial Neural Networks (ANNs) has gained traction over the last few years, especially in applications such as image recognition and speech recognition. These applications typically employ a subset of ANNs known as Convolutional Neural Networks (CNNs), which reuse parameters and thus reduce main memory bandwidth. However, other types of ANN, such as autoencoders and Long Short-Term Memory (LSTM) networks, do not provide reuse opportunities. It is generally accepted that dynamic random-access memory (DRAM) is required to store the parameters of usefully sized ANNs. To achieve a given performance, CNN-specific implementations use cache-like structures built from static random-access memory (SRAM), which minimizes accesses to the slower DRAM. Most research has focused on implementing CNNs, but because of their extensive use of SRAM, these implementations suffer both ANN size restrictions and performance degradation when used in applications that rely on other types of ANN. This work considers embedded applications employing multiple disparate generic ANNs which, assuming there are limited reuse opportunities in the form of parameter reuse or batch processing, will require usable memory bandwidth on the order of tens of Tbit/s. This work supports Deep Neural Networks (DNNs) that do not provide ANN parameter reuse and suggests that these types of applications will require that all ANN parameters in main memory be accessed in real time. This work coins the phrase “goldilocks bandwidth” for ANN systems in which the system provides the bandwidth required to read all ANN parameters at a real-time rate. This work employs pure 3DIC technology along with a proposed custom 3D-DRAM that exposes an entire page over a very wide data bus (Fig 3). The 3DIC system die stack (Fig 1) includes the 3D-DRAM, a system manager layer and a Processing Engine (PE) layer, collectively known as a Sub-System Column (SSC) (Fig 4).
The targeted 3D-DRAM, the Tezzaron DiRAM4 [1], employs multiple memory array layers in conjunction with a control and IO layer and provides 64 separate vaults, each providing 1 Gbit of storage, which, along with the suggested customizations, provides this work with up to 65 Tbit/s of memory bandwidth.
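
The "goldilocks bandwidth" idea can be illustrated with a short back-of-the-envelope calculation. The sketch below is not taken from the paper: the function names and the example workload (parameter count, precision, inference rate) are hypothetical, and 1 Gbit is assumed to mean 10^9 bits. Only the vault count, per-vault capacity and peak bandwidth come from the abstract.

```python
# Back-of-the-envelope sketch; function names and workload are hypothetical.

VAULTS = 64                        # from the abstract
BITS_PER_VAULT = 10**9             # 1 Gbit per vault; decimal giga is an assumption
PEAK_BANDWIDTH_BPS = 65 * 10**12   # "up to 65 Tbit/s" from the abstract


def goldilocks_bandwidth_bps(param_count, bits_per_param, inferences_per_s):
    """Bandwidth needed to read every parameter once per inference (no reuse)."""
    return param_count * bits_per_param * inferences_per_s


def full_sweeps_per_second(capacity_bits, bandwidth_bps):
    """How many times per second the whole memory could be read end to end."""
    return bandwidth_bps / capacity_bits


total_capacity_bits = VAULTS * BITS_PER_VAULT

# Hypothetical workload: 2e9 parameters, 32 bits each, 30 inferences per second.
required = goldilocks_bandwidth_bps(2 * 10**9, 32, 30)
sweeps = full_sweeps_per_second(total_capacity_bits, PEAK_BANDWIDTH_BPS)

print(f"total capacity: {total_capacity_bits / 1e9:.0f} Gbit")
print(f"required bandwidth: {required / 1e12:.2f} Tbit/s")
print(f"fits within peak: {required <= PEAK_BANDWIDTH_BPS}")
print(f"full-memory sweeps per second at peak: {sweeps:.1f}")
```

Under these assumptions, the required bandwidth scales linearly with the total parameter bits and the inference rate, which is why workloads without parameter reuse or batching are bounded by how quickly the whole parameter store can be streamed.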
