Abstract

This article overviews the emerging use of deep neural networks in data analytics and explores which types of underlying hardware and architectural approaches are best suited to each deployment location when implementing deep neural networks. The locations discussed are the cloud, the fog, and the dew (dew computing is performed by end devices). The architectural approaches covered include multicore processors (central processing units), manycore processors (graphics processing units), field-programmable gate arrays, and application-specific integrated circuits. The classification proposed in this article divides the existing solutions into 12 categories, organized along two dimensions. It allows a comparison of existing architectures, which are predominantly cloud-based, with anticipated future architectures, which are expected to be hybrid cloud-fog-dew architectures for applications in the Internet of Things and Wireless Sensor Networks. Researchers interested in studying trade-offs among data processing bandwidth, data processing latency, and processing power consumption would benefit from this classification.
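To make the two classification dimensions (deployment location and hardware type) explicit, the following Python sketch enumerates the 12 resulting categories. The labels are ours and serve only to illustrate the Cartesian-product structure of the classification; they are not the article's exact category names.

    from itertools import product

    # The classification has two dimensions: deployment location
    # (cloud, fog, dew) and hardware type (multicore CPU, manycore GPU,
    # FPGA, ASIC). Their Cartesian product yields the 12 categories.
    LOCATIONS = ["cloud", "fog", "dew"]
    HARDWARE = ["multicore CPU", "manycore GPU", "FPGA", "ASIC"]

    categories = [f"{location} / {hardware}"
                  for location, hardware in product(LOCATIONS, HARDWARE)]

    print(len(categories))  # 12
    for category in categories:
        print(category)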

Highlights

  • Over the last several years, with the proliferation of data and devices, machine learning algorithms have become unavoidable in almost every aspect of human life,[1] especially when deep learning (DL) techniques are used

  • Training and inference of deep neural networks (DNNs) can be done in the cloud, with enormous computational power but hundreds of kilometers away from data sources; in the fog, with less computational power but much closer to data sources; or in the dew, at the source of the data, where processing is closely coupled with the end devices that generate it (see the latency sketch after this list)

  • In our search, which we conducted using the Google Scholar database, we focused on keywords related to DNNs and hardware acceleration
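To make the cloud-fog-dew placement trade-off in the highlights concrete, here is a minimal back-of-the-envelope latency sketch in Python. All network and compute figures are invented for illustration only; they are not measurements from the article.

    # Rough end-to-end latency model for a single DNN inference, depending on
    # where the processing is placed. All numbers are hypothetical.
    PLACEMENTS = {
        # name: (one-way network latency in ms, compute time per inference in ms)
        "cloud (data-center GPU/TPU)": (50.0, 2.0),
        "fog (edge server)": (5.0, 10.0),
        "dew (end device)": (0.0, 80.0),
    }

    def end_to_end_latency_ms(network_ms: float, compute_ms: float) -> float:
        """Round trip to the processing site plus on-site compute time."""
        return 2 * network_ms + compute_ms

    for name, (network_ms, compute_ms) in PLACEMENTS.items():
        print(f"{name}: ~{end_to_end_latency_ms(network_ms, compute_ms):.0f} ms")

Under these made-up numbers, the cloud wins on raw compute time but loses that advantage once the network round trip dominates; exposing this kind of trade-off is what the article's two-dimensional classification is intended for.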


Introduction

Over the last several years, with the proliferation of data and devices, machine learning algorithms have become unavoidable in almost every aspect of human life,[1] especially when deep learning (DL) techniques are used. To cope with the resulting computational demands, many new developments are switching from the control-flow paradigm to the dataflow paradigm.[32,33] Cloud services that rely on FPGAs, such as Baidu XPU, still cannot offer the performance of today’s GPUs or TPUs (tensor processing units), but they offer better energy efficiency for training and inference of DNNs. Companies like Baidu, Intel, and Microsoft have cloud services that rely on FPGAs, which, under certain conditions of interest, can achieve better performance per watt. In terms of performance on ResNet-50 (a CNN), two cloud-based services, Google TPU and Amazon EC2, which rely on the TPUv2 ASIC and the Nvidia V100 GPU, respectively, are compared in [43].
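The performance-per-watt argument above can be illustrated with a simple calculation. The throughput and power numbers in the sketch below are hypothetical placeholders chosen only to show the metric; they are not figures from the article or from the ResNet-50 comparison it cites.

    # "Performance per watt" computed as ResNet-50 throughput divided by board
    # power. All figures are invented placeholders, not benchmark results.
    accelerators = {
        # name: (images per second on ResNet-50, power draw in watts)
        "manycore GPU (V100-class)": (1000.0, 300.0),
        "TPU-class ASIC": (1100.0, 280.0),
        "FPGA-based cloud service": (600.0, 100.0),
    }

    for name, (images_per_second, watts) in accelerators.items():
        print(f"{name}: {images_per_second / watts:.1f} images/s per watt")

With numbers of this shape, an FPGA-based service can come out ahead on images per second per watt even while trailing in absolute throughput, which is the trade-off noted above.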