As the timing guardband consumes more and more design margin with technology scaling, better-than-worst-case (BTWC) techniques have gained attention as a promising solution. BTWC techniques relax the design margin by going beyond pessimistic static timing constraints and using dynamic timing information. However, to guarantee design reliability throughout the lifetime, conventional dynamic timing analysis (DTA) engines need an extra reliability guardband, which is commonly evaluated under the worst-case corners of aging and variation. This guardband consumes precious design margin and thus limits the efficiency improvement available from BTWC techniques. In this paper, we therefore propose AVATAR, an aging- and variation-aware dynamic timing analyzer that performs DTA under the impact of transistor aging and random process variation. AVATAR combines a gate-level aging analysis and a random variation model, which accurately calculate cell delay under transistor aging and random variation, with an event-based DTA algorithm that avoids the pessimism of graph-based analysis. We also propose an ML-assisted DTA acceleration flow for the multicycle DTA of homogeneous multicore designs. We present two case studies that show the effectiveness of AVATAR. First, we present an application-based dynamic voltage and frequency scaling (DVFS) design methodology built on AVATAR, which exploits application-level dynamic timing slack (DTS) to improve energy efficiency and performance. The results demonstrate that, compared to a design based on conventional corner-based DTA, the AVATAR-based design achieves up to 14% additional performance improvement or up to 20% additional power saving. Second, we apply the proposed ML-assisted acceleration flow to reliability-aware deep neural network (DNN) accelerator simulation.
We use the proposed flow to estimate the impact of timing errors due to aging and random variation on the inference accuracy of two benchmark DNNs. The results demonstrate that the proposed acceleration flow achieves up to 10× speedup with an average error of less than 2%.
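
To make the notion of application-level dynamic timing slack concrete, the following is a minimal sketch and not the AVATAR implementation. It assumes a hypothetical trace of per-cycle activated path delays (the values, the names `dynamic_timing_slack` and `application_min_period`, and the 1.0 ns static period are illustrative only) and ignores the aging and variation models entirely.

```python
from typing import List


def dynamic_timing_slack(clock_period: float, activated_delays: List[float]) -> float:
    """Slack of one cycle: clock period minus the longest activated path delay."""
    if not activated_delays:
        # No path toggled in this cycle, so the whole period is slack.
        return clock_period
    return clock_period - max(activated_delays)


def application_min_period(cycles: List[List[float]]) -> float:
    """Shortest clock period that still covers every activated path in the trace."""
    return max((max(delays) for delays in cycles if delays), default=0.0)


# Hypothetical values (ns): a static worst-case period and a 4-cycle trace.
static_period = 1.0
trace = [[0.62, 0.71], [0.80], [], [0.55, 0.78, 0.66]]

slacks = [dynamic_timing_slack(static_period, cycle) for cycle in trace]
min_period = application_min_period(trace)
speedup = static_period / min_period  # frequency headroom for this application
```

In this toy trace the longest activated path is shorter than the static period, so the clock could in principle be tightened for this application; an aging- and variation-aware analyzer would instead derive the per-cycle delays from its cell delay models rather than from fixed numbers.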