Abstract

As big data systems spread rapidly, performance problems in these systems have become a common concern. As the first line of defense against these problems, performance diagnosis plays an essential role in big data systems, yet it is notoriously difficult to conduct in large distributed systems. Previous work either pinpoints root causes by instrumenting the applications or runtime systems in a white-box way, which incurs considerable overhead, or merely provides hints about the hidden root causes in a black-box way. Very few works focus on pinpointing the real root causes in a black-box way. To address this problem, this paper proposes a black-box, invariant-based diagnosis approach and implements a proof-of-concept system named InvarNet-X. In this paper, performance diagnosis is formalized as a pattern recognition problem, meaning that each performance problem is identified by a specific pattern. The rationale of InvarNet-X is that the unobservable root causes of performance problems always expose themselves through violations of the associations among directly observable performance metrics. Such observable associations, called likely invariants, are calculated by the maximal information criterion, and each performance problem is represented by a sparse distributed representation of the invariants it violates. A problem signature database is constructed in advance by training on multiple real performance problems. Once a performance anomaly is detected, the diagnosis procedure is triggered, and the root cause is pinpointed by retrieving similar signatures from the signature database. Experimental evaluations in a controlled big data system show that InvarNet-X achieves high accuracy in diagnosing several real performance problems reported in software bug repositories and outperforms several state-of-the-art approaches. Moreover, because it is lightweight, InvarNet-X is easy to deploy for real-time diagnosis in large-scale big data systems.
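
The pipeline described in the abstract (learn invariants, encode violations as a signature, retrieve the closest known signature) can be illustrated with a minimal sketch. This is not the InvarNet-X implementation: absolute Pearson correlation is swapped in for the paper's association measure, Jaccard similarity is an assumed retrieval metric, and all function names, thresholds, metric names, and problem labels below are hypothetical.

```python
from itertools import combinations
from math import sqrt


def association(xs, ys):
    """Absolute Pearson correlation (a stand-in, not the paper's measure)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return abs(cov / sqrt(vx * vy))


def learn_invariants(normal, threshold=0.8):
    """Keep metric pairs that are strongly associated during normal runs."""
    invariants = {}
    for a, b in combinations(sorted(normal), 2):
        score = association(normal[a], normal[b])
        if score >= threshold:
            invariants[(a, b)] = score
    return invariants


def signature(invariants, window, tolerance=0.3):
    """Binary tuple: 1 where an invariant's score dropped by more than tolerance."""
    return tuple(
        1 if base - association(window[a], window[b]) > tolerance else 0
        for (a, b), base in sorted(invariants.items())
    )


def jaccard(s, t):
    """Similarity of two binary signatures (assumed retrieval metric)."""
    both = sum(1 for x, y in zip(s, t) if x and y)
    either = sum(1 for x, y in zip(s, t) if x or y)
    return both / either if either else 0.0


def diagnose(sig, database):
    """Return (label, similarity) of the most similar stored signature."""
    scored = ((label, jaccard(sig, stored)) for label, stored in database.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical metrics collected while the system behaves normally.
normal = {
    "cpu": [1, 2, 3, 4, 5, 6, 7, 8],
    "disk": [2, 4, 6, 8, 10, 12, 14, 16],
    "net": [8, 7, 6, 5, 4, 3, 2, 1],
}
invariants = learn_invariants(normal)

# Hypothetical anomalous window in which disk activity decouples from the rest.
faulty = {
    "cpu": [1, 2, 3, 4, 5, 6, 7, 8],
    "disk": [5, 1, 4, 2, 5, 1, 4, 2],
    "net": [8, 7, 6, 5, 4, 3, 2, 1],
}
sig = signature(invariants, faulty)

# Hypothetical signature database built from previously observed problems.
database = {
    "disk contention": (1, 0, 1),
    "network congestion": (0, 1, 1),
}
label, similarity = diagnose(sig, database)
print(sig, label, similarity)
```

In this toy setup the signature has one bit per learned invariant, ordered by metric-pair name, so signatures from different windows are directly comparable as long as they are built from the same invariant set.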
