Clustering Performance Anomalies Based on Similarity in Processing Time Changes

Satoshi Iwata,Kenji Kono

doi:10.2197/ipsjtrans.5.1

Abstract

Performance anomalies in web applications are becoming a huge problem and the increasing complexity of modern web applications has made it much more difficult to identify their root causes. The first step toward hunting for root causes is to narrow down suspicious components that cause performance anomalies. However, even this is difficult when several performance anomalies simultaneously occur in a web application; we have to determine if their root causes are the same or not. We propose a novel method that helps us narrow down suspicious components called performance anomaly clustering, which clusters anomalies based on their root causes. If two anomalies are clustered together, they are affected by the same root cause. Otherwise, they are affected by different root causes. The key insight behind our method is that anomaly measurements that are negatively affected by the same root cause deviate similarly from standard measurements. We compute the similarity in deviations from the non-anomalous distribution of measurements, and cluster anomalies based on this similarity. The results from case studies, which were conducted using RUBiS, which is an auction prototype modeled after eBay.com, are encouraging. Our clustering method output clusters crucial in the search for root causes. Guided by the clustering results, we searched for components exclusively used by each cluster and successfully determined suspicious components, such as the Apache web server, Enterprise Beans, and methods in Enterprise Beans. The root causes we found were shortages in network connections, inadequate indices in the database, and incorrect issues with SQLs, and so on.

Full Text