A hardware mechanism for dynamic extraction and relayout of program hot spots

Matthew C Merten, Wen-Mei W Hwu, Erik M Nystrom, Ronald D Barnes, Andrew R Trick

doi:10.1145/339647.339655

Abstract

This paper presents a new mechanism for collecting and deploying runtime optimized code. The code-collecting component resides in the instruction retirement stage and lays out hot execution paths to improve instruction fetch rate as well as enable further code optimization. The code deployment component uses an extension to the Branch Target Buffer to migrate execution into the new code without modifying the original code. No significant delay is added to the total execution of the program due to these components. The code collection scheme enables safe runtime optimization along paths that span function boundaries. This technique provides a better platform for runtime optimization than trace caches, because the traces are longer and persist in main memory across context switches. Additionally, these traces are not as susceptible to transient behavior because they are restricted to frequently executed code. Empirical results show that on average this mechanism can achieve better instruction fetch rates using only 12KB of hardware than a trace cache requiring 15KB of hardware, while producing long, persistent traces more suited to optimization.
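
The deployment idea in the abstract, redirecting execution into relaid-out code through the Branch Target Buffer while leaving the original code unmodified, can be illustrated with a toy software model. The sketch below is not the paper's hardware design: the class names, the `remapped_target` field, and the addresses are hypothetical, and capacity, aliasing, and hot spot detection are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class BTBEntry:
    """One toy BTB entry. `remapped_target` is a hypothetical extension field."""
    original_target: int
    remapped_target: Optional[int] = None  # set once a relaid-out copy exists


class ToyBTB:
    """Toy branch target buffer keyed by branch address (assumed unbounded)."""

    def __init__(self) -> None:
        self.entries: Dict[int, BTBEntry] = {}

    def record_branch(self, branch_pc: int, target: int) -> None:
        # Remember the original target the first time a taken branch is seen.
        self.entries.setdefault(branch_pc, BTBEntry(original_target=target))

    def install_remap(self, branch_pc: int, new_target: int) -> None:
        # Point an existing entry at the relaid-out copy.
        # The original code and its recorded target are left untouched.
        self.entries[branch_pc].remapped_target = new_target

    def predict_target(self, branch_pc: int) -> Optional[int]:
        entry = self.entries.get(branch_pc)
        if entry is None:
            return None  # BTB miss: no prediction available
        if entry.remapped_target is not None:
            return entry.remapped_target  # steer fetch into the new layout
        return entry.original_target


btb = ToyBTB()
btb.record_branch(0x1000, 0x2000)   # branch at 0x1000 normally goes to 0x2000
before = btb.predict_target(0x1000)
btb.install_remap(0x1000, 0x8000)   # hypothetical address of the relaid-out trace
after = btb.predict_target(0x1000)
```

In this model, `before` is the original target and `after` is the remapped one, while the entry still records the original target, which mirrors the claim that execution migrates without modifying the original code.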

Similar Papers

A hardware mechanism for dynamic extraction and relayout of program hot spots
Matthew C Merten ... Ronald D Barnes
ACM SIGARCH Computer Architecture News | VOL. 28
01 May 2000

Optimising power efficiency in trace cache fetch unit
J Hu ... M Kandemir
IET Computers & Digital Techniques | VOL. 1
02 Jul 2007

An architectural framework for runtime optimization
M.C Merten ... R.D Barnes
IEEE Transactions on Computers | VOL. 50
01 Jun 2001

Using dynamic branch behavior for power-efficient instruction fetch
J.S Hu ... M Kandemir
11 Jan 2017
