Abstract

For innovative portable products, Systems on Chips (SoCs) containing several processors, memories and specialised modules are obviously required. Performances but also low-power are main issues in the design of such SoCs. Are these low-power SoCs only constructed with low-power processors, memories and logic blocks? If the latter are unavoidable, many other issues are quite important for low-power SoCs, such as the way to synchronise the communications between processors as well as test procedures, online testing, software design and development tools. This paper is a general framework for the design of low-power SoCs, starting from the system level to the architecture level, assuming that the SoC is mainly based on the re-use of low-power processors, memories and logic peripherals. SoCs with many processors, co-processors, memories and peripherals cannot be synchronised with a single master clock, due to larger and larger wire delays in deep submicron technologies. Several clocking schemes have been proposed, such as GALS (Globally Asynchronous Locally Synchronous) but also full asynchronous architectures. This paper will present the advantages and disadvantages of these SoC clocking strategies as well as the impacts on low power. For embedded SoCs containing several processors, one has to write several pieces of software for each processor starting typically from a high-level specification using the C/C++ language. In order to tackle this problem, we propose to first transform the original specification by means of a systematic script of platform-independent source code transformations. That is illustrated by applying global loop transformation techniques to identify asynchronous partitions exhibiting little communication and high locality of access characteristics. In a second stage, we explore multiple-instruction multiple-data (MIMD) mapping onto a given (partly) predefined platform using advanced space-time analysis techniques to maintain low data transfer rates while achieving high system throughput. At the SoC level, accurate cost feedback including high-level power estimation is required. From this essential information, energy trade-offs between application sub-modules can for example be used to refine the solution further. In the case of mapping onto programmable cores with a shared memory hierarchy, a final refinement consists in reorganising the data layout for efficient cache utilisation.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.