Abstract
Existing performance management tools for large-scale distributed web services detect anomalies in the performance metric behavior by thresholding on the metrics, which often leads to high false alarm rates, is hard to interpret, and misses multimodal performance behavior. We provide an information-theoretic approach to detecting anomalies in the metric behavior by taking into account the temporal and spatial relationships among the metrics. We model the metrics using a parametric mixture distribution such that each component of the mixture represents a homogeneous segment of temporally contiguous metric behavior. We discover the number, parameters and (temporal) locations the segments (i.e., mixture components) by minimizing an information-theoretic relative entropy between the mixture model and the unknown, underlying distribution of the metrics. We then cluster the discovered segments based on the statistical distances between them to detect any anomalous performance behavior and modes of typical metric behavior.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.