Abstract

In the last 15 years we have seen, as a response to power and thermal limits for current chip technologies, an explosion in the use of multiple and even many computer cores on a single chip. But now, to further improve performance and energy efficiency, when there are potentially hundreds of computing cores on a chip, we see a need for specialization of individual cores and the development of heterogeneous manycore computer architectures. However, developing such heterogeneous architectures is a significant challenge. Therefore, we propose a design method to generate domain-specific manycore architectures based on the RISC-V instruction set architecture and automate the main steps of this method with software tools. The design method allows generation of manycore architectures with different configurations, including core augmentation through instruction extensions and custom accelerators. The method starts from developing applications in a high-level dataflow language and ends by generating synthesizable Verilog code and a cycle-accurate emulator for the generated architecture. We evaluate the design method and the software tools by generating several architectures specialized for two different applications and measuring their performance and hardware resource usage. Our results show that the design method can be used to generate specialized manycore architectures targeting applications from different domains. The specialized architectures show at least 3 to 4 times better performance than their general-purpose counterparts. In certain cases, replacing general-purpose components with specialized components saves hardware resources. Automating the method increases the speed of architecture development and facilitates the design space exploration of manycore architectures.

Highlights

  • Today's applications, such as audio/vision processing, wireless communication, and machine learning, process massive amounts of data produced by all kinds of sensors

  • We used the parallel implementations that we developed in this study together with the sequential implementation from a prior study [10] of the autofocus criterion calculation to evaluate the generation of single core, dual core and manycore architectures

  • In this paper we address this challenge and propose a design method that can generate application-specific or domain-specific manycore architectures, with software tools that automate the steps of the method

Summary

Introduction

Today's applications, such as audio/vision processing, wireless communication, and machine learning, process massive amounts of data produced by all kinds of sensors. These applications require high computation power in order to provide results in a reasonable amount of time. One major solution to the continuous demand for higher computation power has been technology development, which enables higher chip densities and higher clock rates. However, there are thermal limitations on chip temperature [1]. These limitations have led the industry to processors with lower clock rates and higher numbers of cores, which can run in parallel.

