Abstract
For performance analysis tools to be useful, they need to relate detected bottlenecks to source code. To this end, it often makes sense to attribute a bottleneck to the instruction that triggers the problematic event. For cache line utilization, however, usage information is only available at eviction time, yet it may be better attributed to the instruction that loaded the line. Such attribution is impossible with current processor hardware. Callgrind, a cache simulator that is part of the open-source Valgrind tool, can do this. However, it only provides self costs. In this paper, we extend the cost attribution of cache use metrics to inclusive costs, which helps with top-down analysis of complex workloads. The technique can be used for all event types where collected metrics need to be attributed to instructions executed earlier in a program run to be useful.
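The attribution idea can be illustrated with a minimal sketch. It is not Callgrind's implementation: the class names, the tiny direct-mapped cache, the 64-byte line size, and the use of "unused bytes at eviction" as the utilization metric are all assumptions made for illustration. Each line remembers the instruction and call chain that loaded it; at eviction, the unused-byte count is charged to that loading instruction (self cost) and to every function on the recorded call chain (inclusive cost).

```python
from collections import defaultdict

LINE_SIZE = 64  # bytes per cache line (assumed value)
NUM_SETS = 4    # tiny direct-mapped cache, for illustration only


class CacheLine:
    def __init__(self, tag, loader, call_chain):
        self.tag = tag                # line number held in this slot
        self.loader = loader          # instruction that caused the fill
        self.call_chain = call_chain  # functions active at fill time
        self.touched = set()          # byte offsets accessed while resident


class Simulator:
    """Hypothetical simulator: charges unused bytes to the loading context."""

    def __init__(self):
        self.sets = [None] * NUM_SETS
        self.self_cost = defaultdict(int)       # per loading instruction
        self.inclusive_cost = defaultdict(int)  # per function, callees included

    def access(self, addr, size, instr, call_chain):
        # Assumes the access does not cross a cache line boundary.
        line_no = addr // LINE_SIZE
        idx = line_no % NUM_SETS
        line = self.sets[idx]
        if line is None or line.tag != line_no:
            if line is not None:
                self._evict(line)
            # Record who loaded the line so eviction can attribute back.
            line = CacheLine(line_no, instr, tuple(call_chain))
            self.sets[idx] = line
        offset = addr % LINE_SIZE
        line.touched.update(range(offset, offset + size))

    def _evict(self, line):
        # The metric only becomes known here, at eviction time.
        unused = LINE_SIZE - len(line.touched)
        self.self_cost[line.loader] += unused
        # Count each function once, so recursion is not double-charged.
        for fn in set(line.call_chain):
            self.inclusive_cost[fn] += unused

    def flush(self):
        # Evict everything at program end so resident lines are counted too.
        for i, line in enumerate(self.sets):
            if line is not None:
                self._evict(line)
                self.sets[i] = None
```

In this sketch, a caller feeds each memory access together with a label for the accessing instruction and the current call chain, then calls `flush()` at the end of the run and reads `self_cost` and `inclusive_cost`. The inclusive figures are what would support a top-down view: a function's entry covers lines loaded anywhere beneath it in the call chain.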