Abstract
New superconductor single flux quantum (SFQ) technology, such as Reciprocal Quantum Logic (RQL), is currently considered one of the promising candidates for high-performance, energy-efficient computing. This paper presents our work on the design and detailed energy efficiency analysis of three types of 32- and 64-bit RQL multi-ported pipelined local storage structures (13 total), namely 1) random access memory (RAM) and register files, 2) direct-mapped write-through and write-back caches, and 3) first-in-first-out (FIFO) buffers. Our layout-aware cell-level design process uses a VHDL RQL cell library developed at the Ultra High Speed Computing Laboratory at Stony Brook University (SBU). The SBU VHDL RQL cell library specifies the dynamic and standby energy consumption, gate delays, the number of Josephson junctions (JJs) per cell, and approximate sizes of individual cells, based on the parameters of the 248 nm, 100 μA/μm², 10-Nb-metal-layer SFQ fabrication process currently under development at the MIT Lincoln Laboratory. Gate and wire delays as well as clock skew are taken into account during digital circuit simulation done with Mentor Graphics CAD tools. After a physical chip layout is completed, the circuit models will need to be updated and re-simulated to include the effects of parasitic inductances and actual wire lengths on signal propagation delays. To meet both performance and energy efficiency targets, the RQL storage structures were designed with RQL non-destructive read-out single-bit storage cells. We chose a relatively moderate clock frequency of 8.5 GHz for all storage units to keep their read latencies in the range of 1-3 cycles. The most complex design in terms of JJ count is a triple-ported 4 Kbit 64x64-bit register file with 253,918 JJs and a read access latency of 338 ps. The highest energy consumption in terms of energy/operation/bit (~9.5 aJ at 4.2 K) is for a write hit in a 2 Kbit 32-bit wide write-back cache.
The average energy consumption of the RQL storage designs varies from ~1.6 aJ/operation/bit for a small 4x32-bit FIFO to ~7.3 aJ/operation/bit for the 2 Kbit write-back cache at 4.2 K. Given a cryocooler efficiency of 0.1%, this corresponds to an energy consumption of ~1.6-7.3 fJ/operation/bit at room temperature. The physical implementation of the RQL storage units will become feasible upon the development of the target MIT fabrication process and CAD tools for VLSI RQL chip design in 2015-2016.
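The arithmetic behind the figures quoted above can be checked with a minimal sketch. The function names are hypothetical and chosen for illustration only; rounding a latency up to a whole number of clock cycles is an assumption, not something the abstract states.

```python
import math

CLOCK_GHZ = 8.5                 # clock frequency quoted in the abstract
CRYOCOOLER_EFFICIENCY = 0.001   # 0.1% cryocooler efficiency quoted in the abstract


def clock_period_ps(freq_ghz):
    """Clock period in picoseconds for a frequency given in GHz."""
    return 1000.0 / freq_ghz


def latency_cycles(latency_ps, freq_ghz):
    """Latency in whole clock cycles (assumption: round up to the next cycle)."""
    return math.ceil(latency_ps / clock_period_ps(freq_ghz))


def room_temp_energy_fj(energy_aj_at_4k, efficiency):
    """Convert energy in aJ at 4.2 K to the wall-plug equivalent in fJ.

    Dividing by the cryocooler efficiency gives the room-temperature
    energy in aJ; dividing by 1000 converts aJ to fJ.
    """
    return energy_aj_at_4k / efficiency / 1000.0


if __name__ == "__main__":
    print(clock_period_ps(CLOCK_GHZ))                       # period in ps
    print(latency_cycles(338, CLOCK_GHZ))                   # register file read latency
    print(room_temp_energy_fj(1.6, CRYOCOOLER_EFFICIENCY))  # small FIFO
    print(room_temp_energy_fj(7.3, CRYOCOOLER_EFFICIENCY))  # write-back cache
```

Under these assumptions, the 338 ps register file read fits within three 8.5 GHz cycles, and a 0.1% cryocooler efficiency scales each attojoule at 4.2 K to one femtojoule at room temperature.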