Abstract

Simultaneous multithreading (SMT) is becoming one of the major trends in the design of future generations of microarchitectures. Its key strength is its ability to exploit both thread-level and instruction-level parallelism, which lets it use hardware resources efficiently. Nevertheless, SMT has limitations: contention between threads that may cause conflicts, lack of scalability, additional pipeline stages, and inefficient handling of long-latency operations. Chip multiprocessors (CMP), by contrast, are highly scalable and easy to program, but they are expensive and suffer from cache coherence and memory consistency problems. This paper proposes a microarchitecture that exploits parallelism at the instruction, thread, and processor levels by merging the SMT and CMP concepts. As in CMP, multiple cores are placed on a single chip. Hardware resources are replicated in each core, except for the secondary-level cache, which is shared among all cores. The processor applies the SMT technique within each core to make full use of the available hardware resources. Moreover, communication overhead is reduced because the cores are independent of one another. Results show that the proposed microarchitecture outperforms both SMT and CMP. In addition, resources are more evenly distributed among running threads.
