Abstract

Performance analysis and optimization grow more challenging as large computing systems increase in scale and complexity. The need for holistic system analysis becomes apparent when traditional approaches fail to collect the information required to investigate performance penalties caused by shared system resources. We have developed a distributed approach that collects and processes performance data from shared system resources. We call our software implementation of this approach Dataheap and have integrated it with a traditional program tracing facility. In this paper we describe the needs that have driven this development as well as connections to related projects. Dataheap is based on a threaded server, distributed agents that collect performance data, a storage backend that makes use of different databases, and access libraries that allow external systems to retrieve current and historic performance data. The server then processes incoming performance data and allows secondary metrics to be created on the fly, which helps to transform individual system characteristics into standard performance metrics. Finally, we briefly illustrate how this approach has enhanced our performance debugging capabilities as well as our research on energy-efficient computing.
