Reducing register pressure in SMT processors through L2-miss-driven early register release

Joseph J Sharkey, Dmitry V Ponomarev, Jason Loew

doi:10.1145/1455650.1455652

Abstract

The register file is one of the most critical datapath components limiting the number of threads that can be supported on a simultaneous multithreading (SMT) processor. To allow the use of smaller register files without degrading performance, techniques that use registers more efficiently through aggressive register allocation and deallocation can be considered. In this article, we propose a novel technique for the early deallocation of physical registers allocated to threads that experience L2 cache misses. This is accomplished by speculatively committing the load-independent instructions and deallocating the registers corresponding to the previous mappings of their destinations, without waiting for the cache miss request to be serviced. The early-deallocated registers are then made immediately available for allocation to instructions within the same thread as well as within other threads, thus improving overall processor throughput. On average across the simulated mixes of multiprogrammed SPEC 2000 workloads, our technique yields a 33% improvement in throughput and a 25% improvement in the harmonic mean of weighted IPCs over a baseline SMT processor with the state-of-the-art DCRA policy. This is achieved without creating checkpoints, maintaining per-register counters of pending consumers, or performing tag rebroadcasts, register remappings, or additional associative searches.
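
The bookkeeping idea in the abstract can be illustrated with a small software toy. This is a minimal sketch under stated assumptions, not the hardware mechanism of the article: the `RenamedInstr` record, the function `early_release_candidates`, and the example register numbers are all hypothetical. The sketch marks instructions that depend on the missing load by propagating a "tainted" set of physical registers, treats the remaining instructions as speculatively committable, and collects the previous mappings of their destinations. As a conservative guard of its own, it refuses to release a register that is tainted or is still read by a load-dependent instruction. The abstract states that the actual design avoids per-register consumer counters and does not describe how such hazards, branches, or exceptions are handled, so none of that is modeled here.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class RenamedInstr:
    """Hypothetical record for an already-renamed instruction."""
    name: str
    src_phys: Tuple[int, ...]   # physical registers read
    dest_phys: Optional[int]    # physical register written, if any
    prev_phys: Optional[int]    # previous mapping of the same architectural dest


def early_release_candidates(
    window: List[RenamedInstr], load_dest_phys: int
) -> Tuple[List[str], List[int]]:
    """Walk instructions younger than an L2-missing load, in program order.

    Returns the names of load-independent instructions (speculatively
    committable in this toy) and the previous-mapping registers the toy
    considers releasable.
    """
    tainted: Set[int] = {load_dest_phys}   # values that depend on the load
    pending_reads: Set[int] = set()        # read by stalled dependents (toy guard)
    committed: List[str] = []
    released: List[int] = []

    for ins in window:
        if any(p in tainted for p in ins.src_phys):
            # Load-dependent: it cannot execute yet, so its sources stay live
            # and its destination becomes load-dependent as well.
            pending_reads.update(ins.src_phys)
            if ins.dest_phys is not None:
                tainted.add(ins.dest_phys)
            continue

        committed.append(ins.name)
        prev = ins.prev_phys
        # Conservative guard (an assumption of this sketch only).
        if prev is not None and prev not in tainted and prev not in pending_reads:
            released.append(prev)

    return committed, released


# Hypothetical window: the missing load writes physical register 10.
example_window = [
    RenamedInstr("add", (3, 4), 11, 2),     # independent, frees p2
    RenamedInstr("mul", (10, 11), 12, 5),   # reads the load result: dependent
    RenamedInstr("xor", (3, 4), 14, 11),    # independent, but p11 is read by mul
    RenamedInstr("or", (3, 4), 15, 7),      # independent, frees p7
    RenamedInstr("sub", (6, 3), 16, 12),    # independent, but p12 is tainted
]
example_committed, example_released = early_release_candidates(example_window, 10)
print(example_committed, example_released)
```

In this hypothetical window, four of the five instructions are load-independent, and two physical registers (2 and 7) are returned early under the toy's guard. In an SMT setting those registers could then serve any thread, which is the source of the throughput gain the abstract reports.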

Journal: ACM Transactions on Architecture and Code Optimization
Publication Date: Nov 1, 2008
Citations: 4
