Abstract

The explosive growth of deep learning (DL)-based artificial intelligence (AI) applications demands computing capability that standalone CPU computing cannot provide. Heavy, mission-critical DL kernel computation therefore relies on heterogeneous computing (HGC) platforms that integrate CPUs, GPUs, and accelerators together with substantial data storage. However, the metallic electrical interconnect in existing manycore platforms cannot sustainably handle the massively increasing bandwidth demand of big-data-driven AI applications. Incorporating an optical network-on-chip (ONoC) to provide ultrahigh bandwidth, we propose REGO, a rapid topology generation and core mapping method for ONoC-based, energy-efficient HGC multicore architectures. The genetic algorithm (GA)-based REGO incorporates the structural characteristics of the optical router into its fitness function and thereby balances the trade-off among the required throughput, optical signal-to-noise ratio (OSNR), and total energy consumption. Furthermore, the crossover step accelerates convergence by suppressing randomness in the GA, significantly reducing the excessive running time caused by the problem's NP-hard nature. Compared with conventional mesh- and torus-topology-based ONoCs, respectively, the ONoC generated by REGO achieves, on average, throughput increases of 63.29% and 22.80% and energy-per-bit reductions of 50.24% and 9.56% on VGG-16 and VGG-19.
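To make the trade-off described above concrete, the sketch below shows one way a scalar GA fitness could combine the three quantities named in the abstract: required throughput, OSNR, and energy per bit. The weights, reference values, and function name are illustrative assumptions and are not taken from the REGO paper.

    # Hypothetical scalar fitness combining the objectives named in the abstract.
    # Weights and normalization constants are assumptions, not REGO's values.
    def fitness(throughput_gbps, osnr_db, energy_pj_per_bit,
                w_t=1.0, w_o=1.0, w_e=1.0,
                t_ref=100.0, osnr_min=20.0, e_ref=1.0):
        """Higher is better: reward throughput and OSNR margin, penalize energy."""
        t_term = w_t * (throughput_gbps / t_ref)     # normalized throughput
        o_term = w_o * (osnr_db - osnr_min)          # OSNR margin over a floor (dB)
        e_term = w_e * (energy_pj_per_bit / e_ref)   # normalized energy cost
        return t_term + o_term - e_term

    # Two candidate topologies scored under the same weights:
    print(fitness(150.0, 24.5, 0.8))   # higher throughput, lower energy per bit
    print(fitness(120.0, 26.0, 1.2))   # better OSNR, but more energy per bit

A GA would maximize such a score per chromosome, so a topology that raises throughput at the cost of excessive energy per bit or an OSNR below the receiver's requirement is penalized rather than discarded outright.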

Highlights

  • Deep learning (DL), a class of machine learning algorithms, trains a nonlinear function approximator represented by a deep neural network (DNN) architecture using input-output pairs of training data [1]

  • An optical NoC (ONoC) based on silicon photonics is being actively investigated as an alternative to electrical NoCs (ENoCs)

  • A SystemC-based cycle-accurate simulator was built for evaluation, with ONoC parameters extracted through the linear optical device model (LODM) proposed in [12]


Summary

INTRODUCTION

Deep learning (DL), a class of machine learning algorithms, trains a nonlinear function approximator represented by a deep neural network (DNN) architecture using input-output pairs of training data [1]. Tahir et al. proposed a congestion-aware core mapping scheme using betweenness centrality that can identify highly loaded NoC links [27]. These studies were derived from a lightweight computing algorithm that swaps the locations of cores, which makes them difficult to apply to a topology generation method that must consider various network conditions. While the sensitivity and Le of the photodetector are hardware constraints that are not controllable, the OSNR and the number of MR heaters, which significantly affect the total power consumption of ONoCs, are strongly correlated with the implementation style. Both the OSNR and the number of MR heaters must therefore be assessed in the process of GA-based topology generation and core mapping. All chromosomes go through the initialization phase, which creates a
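As a rough illustration of the GA steps summarized above (an initialization phase that creates the chromosomes, followed by a crossover that suppresses randomness to speed up convergence), the sketch below encodes each chromosome as a core-to-router permutation. The encoding, the seeds, and the ordered-crossover operator are assumptions for illustration; they do not reproduce REGO itself, and each chromosome would still be scored with an OSNR- and heater-aware fitness such as the one sketched after the abstract.

    import random

    # Illustrative chromosome for GA-based topology generation and core mapping:
    # a permutation assigning each core to a router node. Names and operators
    # are assumptions, not the REGO implementation.

    def init_population(num_cores, pop_size, rng=None):
        """Initialization phase: each chromosome is a random core-to-node permutation."""
        rng = rng or random.Random(0)
        base = list(range(num_cores))
        return [rng.sample(base, num_cores) for _ in range(pop_size)]

    def ordered_crossover(parent_a, parent_b, rng=None):
        """Keep a slice of parent_a and fill the rest in parent_b's order, so the
        child stays a valid permutation; reusing parental structure this way adds
        less randomness than re-sampling a mapping from scratch."""
        rng = rng or random.Random(0)
        n = len(parent_a)
        i, j = sorted(rng.sample(range(n), 2))
        child = [None] * n
        child[i:j] = parent_a[i:j]
        fill = [gene for gene in parent_b if gene not in child]
        for k in range(n):
            if child[k] is None:
                child[k] = fill.pop(0)
        return child

    population = init_population(num_cores=16, pop_size=8)
    child = ordered_crossover(population[0], population[1])

Mutation and selection steps would follow the same pattern but are omitted from this sketch.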

PROBLEM DEFINITION AND TERMINOLOGY
Notation: C, Cset, λ, ξ, μ, ps, pmin, pmax.
EVOLUTION PHASE
CONCLUSION