Abstract

Microservice architecture has become a popular pattern for building large-scale applications because of its flexibility, scalability, and agility. However, the large number of services and their complex dependencies make diagnosing performance issues difficult and time-consuming. We propose MicroDiag, an automated system that localizes the root causes of performance issues in microservice systems at a fine granularity: it not only locates the faulty component but also reveals detailed information about its abnormality. MicroDiag constructs a component dependency graph and performs causal inference on diverse anomaly symptoms to derive a metrics causality graph, from which it infers root causes. In an experimental evaluation on a microservice benchmark running in a Kubernetes cluster, MicroDiag achieves 97% precision for the top 3 most likely root causes, outperforming state-of-the-art methods by at least 31.1%.
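
To make the headline metric concrete, the following is a minimal sketch of how precision over the top k ranked root causes is commonly computed in root cause localization studies. It assumes the usual definition (hits among the top k candidates, divided by the smaller of k and the number of true causes, averaged over anomaly cases); the abstract does not spell out the exact formula, and the function names and the component:metric labels below are hypothetical.

```python
def precision_at_k(ranked, true_causes, k):
    """Share of the true root causes that appear among the top-k candidates of one case.

    Assumed definition (not taken from the abstract): hits / min(k, number of true causes).
    """
    if not true_causes:
        raise ValueError("each case needs at least one true root cause")
    hits = len(set(ranked[:k]) & set(true_causes))
    return hits / min(k, len(true_causes))


def mean_precision_at_k(cases, k):
    """Average precision_at_k over all anomaly cases."""
    return sum(precision_at_k(ranked, truth, k) for ranked, truth in cases) / len(cases)


# Hypothetical anomaly cases: (ranked candidates, set of true root causes).
cases = [
    (["cart:cpu", "db:latency", "web:memory"], {"cart:cpu"}),
    (["web:memory", "db:latency", "cart:cpu", "auth:cpu"], {"auth:cpu"}),
]

print(mean_precision_at_k(cases, 3))
print(mean_precision_at_k(cases, 4))
```

In this toy data the first case is a hit within the top 3 and the second is not, so the score rises only when k is large enough to include the second case's true cause.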

