Abstract

This paper proposes a method of buffering instructions by software-based prefetching. The method allows low-end processors to improve their instruction throughput with a minimum of additional logic and power consumption. Low-end embedded processors do not employ caches for mainly two reasons. The first reason is that the overhead of cache implementation in terms of energy and area is considerable. The second reason is that, because a cache's performance primarily depends on the number of hits, an increasing number of misses could cause a processor to remain in stall mode for a longer duration. As a result, a cache may become more of a liability than an advantage. In contrast, the benchmarked results for the proposed software-based prefetch buffering without a cache show a 5–10% improvement in execution time. They also show a 4% or more reduction in the energy-delay-square-product (ED2P) with a maximum reduction of 40%. The results additionally demonstrate that the performance and efficiency of the proposed architecture scales with the number of multicycle instructions. The benchmarked routines tested to arrive at these results are widely deployed components of embedded applications.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call