Abstract
Call path profiling is a scalable measurement technique that has been shown to provide insight into the performance characteristics of complex modular programs. However, poor presentation of accurate and precise call path profiles obscures insight. To enable rapid analysis of an execution's performance bottlenecks, we make the following contributions for effectively presenting call path profiles. First, we combine a relatively small set of complementary presentation techniques to form a coherent synthesis that is greater than the sum of its constituent parts. Second, we extend existing presentation techniques to rapidly focus an analyst's attention on performance bottlenecks. In particular, we (1) show how to scalably present three complementary views of calling-context-sensitive metrics; (2) treat a procedure's static structure as first-class information with respect to both performance metrics and constructing views; (3) enable construction of a large variety of user-defined metrics to assess performance inefficiency; and (4) automatically expand hot paths based on arbitrary performance metrics, through calling contexts and static structure, to rapidly highlight important program contexts. Our work is implemented within HPCToolkit, which collects call path profiles using low-overhead asynchronous sampling.
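To make contribution (4) concrete, the following is a minimal sketch of one way a hot path could be expanded over a tree of calling contexts and static scopes. It is an illustration under stated assumptions, not HPCToolkit's implementation: the `Node` type, the `hot_path` function, the 50% default threshold, and the sample profile are all hypothetical. The rule assumed here is to descend into the costliest child for as long as that child accounts for at least a given fraction of its parent's inclusive metric value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: a calling context or static scope (e.g. a loop)."""
    name: str
    inclusive: float  # inclusive value of the chosen metric for this subtree
    children: list["Node"] = field(default_factory=list)


def hot_path(root: Node, threshold: float = 0.5) -> list[str]:
    """Descend into the costliest child while it holds at least
    `threshold` of its parent's inclusive metric value (assumed rule)."""
    path = [root.name]
    node = root
    while node.children:
        hottest = max(node.children, key=lambda c: c.inclusive)
        # Stop where cost is no longer concentrated in a single child.
        if node.inclusive <= 0 or hottest.inclusive < threshold * node.inclusive:
            break
        path.append(hottest.name)
        node = hottest
    return path


# Hypothetical profile; the values could be any metric, including a derived one.
tree = Node("main", 100.0, [
    Node("init", 5.0),
    Node("solve", 90.0, [
        Node("loop@solve.c:42", 80.0, [
            Node("dot", 30.0),
            Node("axpy", 35.0),
        ]),
        Node("io", 10.0),
    ]),
])

print(hot_path(tree))  # ['main', 'solve', 'loop@solve.c:42']
```

In this sketch the expansion stops at the loop because neither of its children holds half of its cost, which marks the loop as the context where an analyst would start looking. Because the traversal only reads `inclusive`, the same routine applies unchanged to a user-defined metric computed per node.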