Exploring Online Synthesis for CGRAs with Specialized Operator Sets

Stefan Döbrich, Christian Hochberger

doi:10.1155/2011/601986

Abstract

The design of energy-efficient systems has become a major challenge for engineers over the last decade. One way to save energy is to spread out computations in space rather than in time (as traditional processors do). Unfortunately, this requires designing specialized hardware for each application. In addition, the nonrecurring expenses for the manufacturing of chips grow continuously. Implementing the computations on FPGAs and CGRAs solves this dilemma, as the nonrecurring expenses are shared between many different applications. We believe that online synthesis, which takes place during the execution of an application, is one way to broaden the applicability of reconfigurable architectures, as no expert knowledge of synthesis and technologies is required. In this paper, we give a detailed analysis of the amount and specialization of resources in a CGRA that are required to achieve a significant speedup of Java bytecode. In fact, we show that even a relatively small number of specialized reconfigurable resources is sufficient to speed up applications considerably. In particular, we look at the number of dedicated multipliers and dividers. We also discuss the required number of concurrent memory access operations inside the CGRA. Again, it turns out that two concurrent memory access operations are sufficient for almost all applications.

Highlights

Designers of almost all types of systems experience a continuously increasing demand for performance and/or higher energy efficiency.
We evaluated the influence of different characteristics of coarse-grained reconfigurable arrays (CGRAs) on the runtime of synthesized functional units.
Since fine-grained logic requires a large amount of configuration data to be computed, and since the fine-grained structure is neither required nor helpful for the implementation of most code sequences, we focus on CGRAs for inclusion in AMIDAR processors.

Summary

Introduction

Designers of almost all types of systems experience a continuously increasing demand for performance and/or higher energy efficiency. Various options are intensely discussed to satisfy this demand. The technology named most often at present is the multicore processor. Using multicore processors to gain substantial performance improvements is a rather involved process, and, until now, the technology has been found mainly in desktop and server systems. Dual- or quad-core systems are emerging in the area of embedded systems. Popular technologies like general-purpose graphics processors (GPGPUs) consume vast amounts of energy and require very specialized programming environments (e.g., OpenCL or CUDA).
