Abstract

The sustained increase in computational performance demanded by next-generation applications drives the rising core counts of modern multiprocessor systems. However, in the dark silicon era, the performance levels and integration density of such systems are limited by the thermal constraints of their physical package. These constraints are more severe in three-dimensional (3D) integrated systems, as a consequence of the complex thermal characteristics exhibited by stacked silicon dies. This dissertation investigates the development of efficient, thermal-aware multiprocessor architectures, and presents methodologies to enable the simultaneous exploration of their thermal and functional behaviour.
Chapter 2 examines the efficiency of multiprocessor architectures from the perspective of the memory hierarchy, and presents techniques that focus on the effective management and transfer of on-chip data in order to minimize the time spent waiting on memory accesses. In the case of shared-memory multiprocessors, this is achieved through the proposed Persistence Selective Caching (PSC) and CacheBalancer schemes, which influence what data is stored in on-chip caches, where it is stored, and for how long. This enables the memory hierarchy to adapt to changing execution behaviour, balance resource utilization, and, most importantly, reduce the average latency and energy per memory access. Chapter 2 also presents the Pronto system, which enables efficient data transfers in message-passing multiprocessors by minimizing the role of the processing element in the management of transfers. Pronto decreases the overheads incurred in setting up and managing data transfers, thereby yielding shorter communication latencies. It also simplifies the semantics of data movement by abstracting the implementation details of communications from the programmer, thus enabling transfers to be specified entirely at the task level.
The issue of thermal-aware design for 3D Integrated Circuits (ICs) using Nagata’s equation, a mathematical representation of the dark silicon problem, is investigated in Chapter 3. The chapter explores the thermal design space of 3D ICs in terms of this equation, and proposes a high-level flow to characterize the specific influence of individual design parameters on the thermal behaviour of die stacks. The results of this exploration advance the state of the art by providing new insights into the critical role of power density, thermal conductivity and stack construction in the formation of hotspots in 3D ICs. Building on these insights, the Ctherm framework is proposed for the thermal-aware design of multiprocessor systems-on-chip (MPSoCs). Ctherm enables the concurrent evaluation of the thermal and functional performance of MPSoCs using automatically generated fine-grained area, latency and energy models for system components, and facilitates the exploration of thermal behaviour early in the system design flow. The efficacy of the framework is demonstrated using a number of practical design cases ranging from floorplanning and temperature sensor placement to application tuning. Together, the characterization and the Ctherm framework further our understanding of the thermal behaviour of die stacks, and provide a practical template for the realization of thermal-aware electronic design automation tooling for 3D ICs.
The management of thermal issues that arise in 3D MPSoCs at runtime is examined in Chapter 4. Temperature control is typically exercised by means of Dynamic Thermal Management (DTM) schemes, which continuously adapt the activity and power dissipation of system components. A significant disadvantage of state-of-the-art DTM schemes lies in their inability to account for the non-uniform thermal behaviour of die stacks, leading to ineffective management of temperatures and degraded system performance. To address this, a novel 3D Dynamic Voltage Frequency Scaling (DVFS) scheme is proposed that takes these non-uniformities into account within its power management algorithm, maintains operating temperatures within a safe range, and maximizes system performance within the available thermal margins at individual processing elements. The chapter also presents an adaptive routing strategy to decrease the magnitude of thermal gradients in network-on-chip based 3D architectures by directing traffic along paths of low temperature. The proposed Immediate Neighbourhood Temperature (INT) adaptive routing scheme actively steers interconnect traffic away from regions with thermal hotspots based only on temperature information available in the immediate neighbourhood, relying on the heat transfer characteristics of 3D ICs to avoid the need for a global temperature monitoring network. The consequent spreading of interconnect activity over multiple paths results in balanced thermal profiles and decreased operating temperatures across the system. Over the course of these chapters, this dissertation explores the critical issues impeding the realization of thermal-aware 3D stacked multiprocessors, and details a multifaceted approach towards addressing the challenges of dark silicon.
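The idea behind routing on immediate-neighbourhood temperatures, as summarized for Chapter 4, can be illustrated with a minimal sketch. This is not the dissertation's INT algorithm: it assumes a 3D mesh with routers addressed by (x, y, z) coordinates, restricts choices to minimal (shortest-path) hops, and takes neighbour temperatures from a plain dictionary. The names `minimal_next_hops`, `pick_next_hop` and `neighbour_temp` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: routers in a 3D mesh are addressed by (x, y, z) coordinates.
Node = Tuple[int, int, int]


def minimal_next_hops(current: Node, dest: Node) -> List[Node]:
    """Return the neighbours of `current` that lie on a shortest path to `dest`."""
    hops: List[Node] = []
    for axis in range(3):
        delta = dest[axis] - current[axis]
        if delta != 0:
            step = 1 if delta > 0 else -1
            nxt = list(current)
            nxt[axis] += step
            hops.append((nxt[0], nxt[1], nxt[2]))
    return hops


def pick_next_hop(current: Node, dest: Node,
                  neighbour_temp: Dict[Node, float]) -> Node:
    """Pick the coolest productive neighbour using only local temperature readings.

    Assumption: a neighbour with no reading is treated as infinitely hot,
    so any monitored neighbour is preferred over an unmonitored one.
    """
    candidates = minimal_next_hops(current, dest)
    if not candidates:
        raise ValueError("packet is already at its destination")
    return min(candidates, key=lambda n: neighbour_temp.get(n, float("inf")))


if __name__ == "__main__":
    # Two productive neighbours; the cooler one (0, 1, 0) is chosen.
    temps = {(1, 0, 0): 80.0, (0, 1, 0): 65.0}
    print(pick_next_hop((0, 0, 0), (1, 1, 0), temps))
```

Because each router consults only the readings of its adjacent routers, a sketch of this kind needs no global temperature map, which is the property the abstract attributes to the INT scheme.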
