Abstract
Interest in detecting and localizing data integrity errors has recently been renewed, driven by exponentially growing data volumes in large-scale networked systems. Most existing root cause analysis (RCA) systems take an infrastructure operator's view and rely on dedicated, expensive monitoring capabilities to instrument and facilitate the analysis. Unfortunately, in our target wide-area network environment, complete network information and monitoring capabilities are normally lacking. In this paper, we present an RCA system that leverages end-to-end flow monitoring information from the application layer, augmented by limited network information. We demonstrate that root causes can be localized with high accuracy using multi-class classification models. We specifically study the impact of different realistic combinations of features drawn from the available yet incomplete information at both the application and network layers.