Abstract

The computational environment of Deep Learning Neural Networks (DLNNs) differs considerably from that of conventional computer systems. DLNNs require thousands, if not millions, of compute cores, compared to one or a few in conventional systems. There is therefore a need to revisit performance issues to gain a better understanding of how systems behave in such massively parallel architectures. Precision, speed, memory access, bus contention, resource sharing, and chip area are some of the key issues that need to be studied in this changed context. Low-precision multiplication remains one of the most commonly used operations in neural computations. This paper draws the reader's attention to some interesting results in area-speed trade-offs when applied to massively parallel architectures. A new low-precision fixed-point representation is discussed. A hardware accelerator and the software components used in the simulation are briefly described. Results show that serial multipliers can outperform parallel multipliers when throughput per unit area of silicon is considered.
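
The abstract does not specify the proposed representation or multiplier designs; as a generic illustration only, the sketch below shows what a low-precision fixed-point multiply looks like when computed with a shift-and-add (bit-serial style) loop, the kind of structure that trades cycles for silicon area. The word width, number of fractional bits, and function names are assumptions made for the example, not the paper's actual design.

    # Illustrative sketch only; widths and format are assumed, not taken from the paper.
    FRAC_BITS = 4   # assumed number of fractional bits
    WIDTH = 8       # assumed total word width

    def to_fixed(x: float) -> int:
        """Quantize a real value to a signed WIDTH-bit fixed-point integer (with saturation)."""
        v = int(round(x * (1 << FRAC_BITS)))
        lo, hi = -(1 << (WIDTH - 1)), (1 << (WIDTH - 1)) - 1
        return max(lo, min(hi, v))

    def serial_mul(a: int, b: int) -> int:
        """Shift-and-add multiply, mimicking a bit-serial multiplier:
        one partial product is accumulated per loop iteration (i.e. per cycle)."""
        neg = (a < 0) ^ (b < 0)
        a, b = abs(a), abs(b)
        acc = 0
        for i in range(WIDTH):          # WIDTH cycles for a WIDTH-bit operand
            if (b >> i) & 1:
                acc += a << i           # add the shifted multiplicand
        return -acc if neg else acc

    def fixed_mul(a: int, b: int) -> int:
        """Fixed-point product: multiply, then rescale back to FRAC_BITS."""
        return serial_mul(a, b) >> FRAC_BITS

    # Example: 1.25 * -0.75 = -0.9375 at 4 fractional bits
    a, b = to_fixed(1.25), to_fixed(-0.75)
    print(fixed_mul(a, b) / (1 << FRAC_BITS))   # -> -0.9375

A parallel (combinational) multiplier computes all partial products at once and is faster per operation, but occupies more area; the paper's reported result is that, per unit area, many such serial units can deliver higher aggregate throughput.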
