Performance and hardware complexity tradeoffs in designing multithreaded architectures

M Bekerman,G Sheaffer,A Mendelson

doi:10.1109/pact.1996.552552

Abstract

This paper presents performance and step-bp-step complexity analysis of two different design alternatives of multithreaded architecture: dynamic inter-thread resource scheduling and static resource allocation. We show that with two concurrent threads the dynamic scheduling processor achieves from 5 to 45% higher performance at the cost of much more complicated design. The paper shows that for a relatively high number of execution resources the complexity of the dynamic scheduling logic will inevitably require design compromises. Moreover, high chip-wide communication time and an incomplete bypassing network will force the dynamic scheduling to use static-like execution unit assignment, thus reducing its performance advantage. At the same transistor budget the static architecture may implement additional functional units, resulting in better overall performance.

Full Text