Abstract

Simultaneous Multithreading (SMT) improves the performance of superscalar CPUs by allowing multiple threads to execute over a shared path. Instruction throughput improves because shared resources are better utilized when thread-level parallelism is exploited in addition to the intrinsic instruction-level parallelism. The physical register file is one of the most critical shared resources in SMT systems because the number of registers available for renaming is limited. Registers held by long-latency instructions in some threads block the progress of other, faster threads, resulting in inefficient resource utilization and performance degradation. In this paper, we present an algorithm that allots each thread a portion of the rename registers (i.e., a cap) in real time according to its run-time behavior, namely the utilization ratio of its allotted quota and the pace at which it deallocates registers. The proposed method differs from state-of-the-art capping techniques in allowing each thread to adjust its own individual cap value in real time. To preclude over-adjustment, a global lower limit on the cap values is also established in real time to accommodate potentially drastic variations among different mixes of running threads. The proposed method improves IPC by up to 53.8% in a 4-threaded system, 43.8% in a 6-threaded system, and 41.6% in an 8-threaded system.
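
The per-thread capping idea can be illustrated with a minimal Python sketch. This is not the algorithm from the paper. The abstract states only that each cap depends on the thread's quota utilization and its deallocation pace, and that a global lower limit bounds the caps. Everything else here is a hypothetical choice made for illustration: the names `ThreadState` and `adjust_caps`, the thresholds, the step size, and the fact that the lower limit `floor` is passed in as a fixed parameter rather than computed at run time.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    cap: int     # current rename-register quota (assumed > 0)
    in_use: int  # rename registers the thread currently holds
    freed: int   # registers deallocated during the last interval


def adjust_caps(threads, total_regs, floor, step=4,
                high_util=0.9, low_util=0.5, min_freed=4):
    """Run one adjustment interval and return the new caps.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    for t in threads:
        utilization = t.in_use / t.cap
        if utilization >= high_util and t.freed >= min_freed:
            # Quota is nearly full and registers are recycled quickly: grow.
            t.cap += step
        elif utilization < low_util or t.freed < min_freed:
            # Quota sits idle, or registers are released slowly: shrink.
            t.cap -= step
        # Global lower limit prevents over-adjustment; the upper bound is
        # simply the size of the rename-register pool.
        t.cap = max(floor, min(t.cap, total_regs))
    return [t.cap for t in threads]


threads = [
    ThreadState(cap=32, in_use=31, freed=10),  # busy and fast to release
    ThreadState(cap=32, in_use=30, freed=1),   # busy but slow to release
    ThreadState(cap=10, in_use=2, freed=0),    # mostly idle quota
]
new_caps = adjust_caps(threads, total_regs=128, floor=8)
print(new_caps)
```

In this sketch the first thread gains registers, the second loses some because it holds them for long periods, and the third is clamped at the lower limit rather than shrinking further.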

