Abstract

This paper presents the design and software implementation of a high-performance, area-efficient Advanced Encryption Standard (AES) cipher on a many-core platform. A preliminary cipher design is partitioned and mapped to an array of 70 small processors, and offers a throughput of 16.625 clock cycles per byte. The instruction and data memory usage and the workload of each processor are characterized for further optimization. Through workload balancing and processor fusion, the throughput of the cipher is increased by 43% to 9.5 clock cycles per byte, while the number of processors used is reduced to 59, which occupy only 10.03 mm² in a 65 nm fine-grained many-core system. In comparison with published AES implementations on general-purpose processors, our design has 3.6 to 10.7 times higher throughput per area. Moreover, the presented design shows 1.5 times higher throughput than the TI DSP C6201 and 3.4 times higher throughput per area than the GeForce 8800 GTX.
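
The quoted figures can be related with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the helper names are hypothetical, and the clock frequency is an assumed placeholder, since the abstract does not state one. Read as a cost per byte, the move from 16.625 to 9.5 cycles per byte is a reduction of about 43%, which at a fixed clock rate is about 1.75 times the bytes processed per second.

```python
# Figures quoted in the abstract.
BASELINE_CPB = 16.625    # clock cycles per byte, preliminary 70-processor design
OPTIMIZED_CPB = 9.5      # clock cycles per byte, optimized 59-processor design
AES_BLOCK_BYTES = 16     # AES operates on 128-bit (16-byte) blocks

# ASSUMPTION: the abstract gives no clock frequency; 1 GHz is a placeholder.
ASSUMED_CLOCK_HZ = 1.0e9


def cycle_reduction(before_cpb: float, after_cpb: float) -> float:
    """Fractional reduction in cycles per byte."""
    return (before_cpb - after_cpb) / before_cpb


def speedup(before_cpb: float, after_cpb: float) -> float:
    """Ratio of bytes processed per second at a fixed clock rate."""
    return before_cpb / after_cpb


def cycles_per_block(cpb: float) -> float:
    """Clock cycles spent per 16-byte AES block."""
    return cpb * AES_BLOCK_BYTES


def throughput_mbps(cpb: float, clock_hz: float) -> float:
    """Throughput in megabits per second for a given clock frequency."""
    return clock_hz / cpb * 8 / 1e6


if __name__ == "__main__":
    print(f"cycle reduction: {cycle_reduction(BASELINE_CPB, OPTIMIZED_CPB):.1%}")
    print(f"speedup at fixed clock: {speedup(BASELINE_CPB, OPTIMIZED_CPB):.2f}x")
    print(f"cycles per block: {cycles_per_block(BASELINE_CPB):.0f} -> "
          f"{cycles_per_block(OPTIMIZED_CPB):.0f}")
    print(f"throughput at assumed clock: "
          f"{throughput_mbps(OPTIMIZED_CPB, ASSUMED_CLOCK_HZ):.1f} Mbps")
```

Substituting the actual operating frequency for `ASSUMED_CLOCK_HZ` would give an absolute throughput; the ratios do not depend on that choice.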
