Abstract

This paper presents a frame-interleaved low-density parity-check (LDPC) decoder architecture with a new interconnect partitioning scheme and a time-distributed Min-Sum decoding schedule. The architecture exploits the cyclic structure of the parity-check matrix by unrolling each check node update in time over all connected variable nodes, which minimizes wiring complexity and power. Multiple frames are interleaved to maximize hardware utilization, and coarse-grained clock gating systematically turns off inactive logic and memories to save power. To demonstrate the scalability of the proposed architecture, a multirate LDPC decoder test chip for the IEEE 802.11ad standard was fabricated in 28-nm CMOS technology. The design occupies an area of 1.99 mm², contains 160 Kb of embedded static random-access memory, and achieves a throughput of 6.78 Gb/s at 10 decoding iterations for all four code rates specified in the standard. With early decoding termination, the fabricated chip consumes between 104 and 279 mW at a target bit error rate of 10⁻⁶ under nominal operation at a 0.9-V supply and a 202-MHz clock rate, resulting in an energy efficiency between 1.53 and 4.12 pJ/bit/iteration. With clock-frequency and voltage scaling, the chip achieves an energy efficiency between 1.1 and 3.1 pJ/bit/iteration. At nominal clock frequency and supply voltage, the design achieves the highest normalized energy efficiency among recently published CMOS-based decoders for the IEEE 802.11ad standard.
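The nominal energy-efficiency figures can be cross-checked from the power and throughput numbers quoted above. The following is a minimal sketch, assuming that the normalization divides power by throughput and by the nominal 10 decoding iterations; this assumption is not stated in the abstract, but it reproduces the quoted range. The function name and its parameters are hypothetical and chosen for illustration only.

```python
def energy_per_bit_per_iteration_pj(power_mw, throughput_gbps, iterations):
    """Return energy efficiency in pJ/bit/iteration.

    Assumption (not stated in the abstract): efficiency is computed as
    power / (throughput * iterations), using the nominal iteration count.
    """
    power_w = power_mw * 1e-3            # mW -> W
    bits_per_s = throughput_gbps * 1e9   # Gb/s -> b/s
    joules = power_w / (bits_per_s * iterations)
    return joules * 1e12                 # J -> pJ


# Figures quoted in the abstract for nominal operation (0.9 V, 202 MHz).
THROUGHPUT_GBPS = 6.78
ITERATIONS = 10

low = energy_per_bit_per_iteration_pj(104, THROUGHPUT_GBPS, ITERATIONS)
high = energy_per_bit_per_iteration_pj(279, THROUGHPUT_GBPS, ITERATIONS)

print(f"{low:.2f} to {high:.2f} pJ/bit/iteration")
```

Under this assumption, the two endpoints round to the 1.53 and 4.12 pJ/bit/iteration reported for nominal operation.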
