Abstract

The rapid evolution of Cloud-based services and the growing interest in deep learning (DL)-based applications are putting increasing pressure on hyperscalers and general-purpose hardware designers to provide more efficient and scalable systems. Cloud-based infrastructures must be built from more energy-efficient components, and this evolution must reach from the core of the infrastructure (i.e., data centers (DCs)) to its edges (Edge computing) to adequately support new and future applications. Adaptability/elasticity is one of the features required to increase performance-to-power ratios. Hardware-based mechanisms have been proposed to support system reconfiguration, mostly at the level of the processing elements, while fewer studies address scalable, modular interconnection sub-systems. In this paper, we propose a scalable Software-Defined Network-on-Chip (SDNoC)-based architecture. Thanks to a modular design approach, our solution can easily be adapted to support devices ranging from low-power computing nodes at the edge of the Cloud to high-performance many-core processors in Cloud DCs. The proposed design merges the benefits of hierarchical network-on-chip (NoC) topologies (by fusing the ring and 2D-mesh topologies) with those brought by dynamic reconfiguration (i.e., adaptation). The proposed interconnect allows different types of virtualised topologies to be created, serving different communication requirements and thus providing better resource partitioning (virtual tiles) for concurrent tasks. To further allow the software layer to control and monitor the NoC subsystem, a few customised instructions supporting a data-driven program execution model (PXM) are added to the processing element's instruction set architecture (ISA). In general, data-driven programming and execution models are well suited to supporting DL applications.
We also introduce a mechanism that maps a high-level programming language embedding concurrent execution models onto the basic functionalities offered by our SDNoC, easing the programming of the proposed system. In the reported experiments, we compared our lightweight reconfigurable architecture to a conventional flattened 2D-mesh interconnection subsystem. Results show that our design provides an increase in data traffic throughput of % and a reduction of of the average packet latency, compared to the flattened 2D-mesh topology connecting the same number of processing elements (PEs) (up to 1024 cores). Similarly, power and resource consumption (on FPGA devices) are low, confirming the good scalability of the proposed architecture.

Highlights

  • Cloud-based execution environments are in place to process complex machine learning (ML) algorithms

  • To reduce the area and energy costs associated with implementing this look-up table (LUT), we found that providing up to 256 processing elements (PEs) in a single VT is enough to support the mapping of ML/deep learning (DL) algorithms well

  • Unlike the von Neumann execution model, data-driven models require a private block of memory to store the inputs used by a thread to run, a counter storing the number of inputs not yet received, and a pointer to the thread body


Summary

Introduction

Cloud-based execution environments are in place to process complex machine learning (ML) algorithms. There is a flurry of research into designing more efficient DL-based algorithms and custom hardware accelerators to execute them (such as the Xilinx reconfigurable acceleration stack). Most of these accelerators are spatial (i.e., an array of interconnected PEs), with input data processed following a data-driven approach. Internal hardware counters are read via dedicated instructions; this information can be exploited by optimisation tools and compilers to better adapt to the communication patterns of an application. Productivity is improved by introducing data-driven PXM support into a high-level programming language, allowing the application developer to readily exploit the benefits of interconnection reconfigurability and data-driven execution.

System Overview
Challenges and State-of-the-Art
Paper Contribution
Network-on-Chip Architecture
Router Micro-Architecture
Data Packet Structure and Control Flow
Ring Switch Micro-Architecture
NoC Adaptability
Software Interface
High-Level Programming Interface
Data-Driven PXM
Mapping Goroutines on DD-Threads
Linking NoC Software Interface
Evaluation
Network Performance
Area Cost and Power Consumption
Conclusions and Future Work
