Abstract
Time series forecasting is the problem of predicting future data samples from historical information and recent deep neural networks (DNNs) based techniques have achieved excellent results compared with conventional statistical approaches. Many applications at the edge can utilise this technology and most implementations have focused on inference, an ability to train at the edge would enable the deep neural network (DNN) to adapt to changing conditions. Unfortunately, training requires approximately three times more memory and computation than inference. Moreover, edge applications are often constrained by energy efficiency. In this work, we implement a block minifloat (BM) training accelerator for a time series prediction network, N-BEATS. Our architecture involves a mixed precision GEMM accelerator that utilizes block minifloat (BM) arithmetic. We use a 4-bit DSP packing scheme to optimize the implementation further, achieving a throughput of 779 Gops. The resulting power efficiency is 42.4 Gops/W, 3.1x better than a graphics processing unit in a similar technology.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Similar Papers
More From: ACM Transactions on Reconfigurable Technology and Systems
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.