Abstract

Anomaly diagnosis is vital to the performance of online transaction processing (OLTP) systems. Machine learning techniques can reason about complex relationships beyond human ability and perform well on such problems. However, they rely on a large number of anomaly training samples, which are in serious shortage in both industry and academia because they are difficult to collect. This shortage creates demand for a benchmark for anomaly reproduction and data collection. In this paper, we propose DBPA, a benchmark for transactional database performance anomalies. Specifically, we identify nine common anomalies rooted in diverse influencing factors. For each anomaly, we carefully design a reproduction procedure that is consistent with its root cause in real-world databases. With the reproduction procedures, users can easily generate a dataset in a new environment and extend the benchmark with new anomaly types. For compound anomalies, we provide a generation algorithm that lets users generate compound-anomaly data for any possible combination from existing collected data. We also provide a large dataset of both normal and anomalous monitoring data collected from various environments, facilitating the training of machine learning models and the evaluation of new algorithms for anomaly diagnosis.

