Instruction prefetching of systems codes with layout optimized for reduced cache misses

Chun Xia, Josep Torrellas

doi:10.1145/232974.233001

Abstract

High-performing on-chip instruction caches are crucial to keep fast processors busy. Unfortunately, while on-chip caches are usually successful at intercepting instruction fetches in loop-intensive engineering codes, they are less able to do so in large systems codes. To improve the performance of the latter codes, the compiler can be used to lay out the code in memory for reduced cache conflicts. Interestingly, such an operation leaves the code in a state that can be exploited by a new type of instruction prefetching: guarded sequential prefetching. The idea is that the compiler leaves hints in the code as to how the code was laid out. Then, at run time, the prefetching hardware detects these hints and uses them to prefetch more effectively. This scheme can be implemented very cheaply: one bit encoded in control transfer instructions and a prefetch module that requires minor extensions to existing next-line sequential prefetchers. Furthermore, the scheme can be turned off and on at run time with the toggling of a bit in the TLB. The scheme is evaluated with simulations using complete traces from a 4-processor machine. Overall, for 16-Kbyte primary instruction caches, guarded sequential prefetching removes, on average, 66% of the instruction misses remaining in an operating system with an optimized layout, speeding up the operating system by 10%. Moreover, the scheme is more cost-effective and robust than existing sequential prefetching techniques.
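As a rough illustration of the idea in the abstract, the following Python sketch contrasts plain next-line prefetching with a guarded variant. It is not the authors' design: the abstract does not say how the hint bit is interpreted, so the sketch assumes that a set bit marks the end of a compiler-laid-out sequence and that the prefetcher runs ahead sequentially until it reaches such a line. The function name `lines_to_prefetch`, the `guard_lines` set, the `max_depth` limit, and the `page_enabled` flag (standing in for the TLB bit) are all hypothetical.

```python
def lines_to_prefetch(line, guard_lines, max_depth=4, page_enabled=True):
    """Return the cache line numbers to prefetch after a fetch of `line`.

    Assumptions (not from the paper):
      - guard_lines holds the lines containing a control transfer
        instruction whose hint bit is set, taken here to mean
        "the sequential run ends in this line".
      - page_enabled models the per-page TLB bit; when it is off,
        the prefetcher behaves like a plain next-line prefetcher.
      - max_depth caps how far ahead the prefetcher may run.
    """
    if not page_enabled:
        # Scheme turned off for this page: ordinary next-line prefetch.
        return [line + 1]

    prefetched = []
    current = line
    for _ in range(max_depth):
        if current in guard_lines:
            # A guarded control transfer ends the sequence; stop here.
            break
        current += 1
        prefetched.append(current)
    return prefetched


if __name__ == "__main__":
    guards = {12}
    print(lines_to_prefetch(10, guards))                      # guarded run
    print(lines_to_prefetch(10, guards, page_enabled=False))  # next-line only
```

Under these assumptions the guarded prefetcher runs further ahead than a next-line prefetcher inside a sequence, and stops at the point where the compiler's layout indicates that sequential fetch is unlikely to continue.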

Journal: ACM SIGARCH Computer Architecture News
Publication Date: May 1, 1996
Citations: 6

Similar Papers

Instruction cache prefetching using multilevel branch prediction
Alexander V Veidenbaum
01 Jan 1997

A continuous reload on-chip instruction cache for low-end RISC
Maki ... Shigenaga
01 Jan 1992

Non-sequential instruction cache prefetching for multiple-issue processors
Alexander V Veidenbaum ... Qingbo Zhao
International Journal of High Speed Computing | Vol. 10
01 Mar 1999
