Data tagging architecture for system monitoring in dynamic environments

Bharat Krishnamurthy, Anindya Neogi, Bikram Sengupta, Raghavendra Singh

doi:10.1109/noms.2008.4575160

Abstract

Large enterprise systems need continuous monitoring at the infrastructure, application and business levels to detect and prevent problem situations. Traditionally, automated monitoring solutions are programmed once at setup, based on a set of well-defined monitoring objectives, and then handed over to the operations team. Such solutions have underlying data models that are often complex and semantically rich, but in stable environments this complexity is generally hidden from the operations team, who only need to make minor configuration changes (e.g., setting thresholds) as and when required. However, the situation is now changing rapidly, with enterprise data centers subject to continuous transformation as new software, hardware and process components are deployed or updated. This puts an immense burden on monitoring activity: not only must thousands of different parameters be monitored, but service level objectives (SLOs) may also be added and modified continuously. We describe a monitoring system architecture that simplifies the task of authoring and managing SLOs in such dynamic and heterogeneous environments. At the heart of our approach is a lightweight and extensible data model that is derived from more complex configuration models, so as to expose only the data relevant for monitoring to the operations team. Simple string tags derived from this model are then used to label SLOs and their associated data streams. The approach localizes programming to the data-sensor layer and makes authoring simpler than specifying objects in an alternative, richer but more complex, object-oriented representation. We also describe a tag-driven real-time visualization tool that organizes data streams using their accompanying tags and eases user navigation through large volumes of monitoring data.
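
To make the idea of labeling SLOs and data streams with simple string tags concrete, the following Python sketch shows one way such labeling could work. It is an illustration under stated assumptions, not the authors' implementation: the `key:value` tag syntax, the class names `DataStream` and `SLO`, the `threshold` field, and the functions `streams_for_slo` and `group_by_tag_prefix` are all hypothetical and do not come from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataStream:
    name: str
    tags: frozenset  # plain string tags; the "key:value" form is an assumption


@dataclass(frozen=True)
class SLO:
    name: str
    tags: frozenset   # the SLO applies to streams carrying all of these tags
    threshold: float  # hypothetical upper bound on the monitored value


def streams_for_slo(slo, streams):
    """Return the streams whose tag set contains every tag on the SLO."""
    return [s for s in streams if slo.tags <= s.tags]


def group_by_tag_prefix(streams, prefix):
    """Group stream names by the remainder of each tag starting with `prefix`."""
    groups = defaultdict(list)
    for s in streams:
        for tag in s.tags:
            if tag.startswith(prefix):
                groups[tag[len(prefix):]].append(s.name)
    return dict(groups)


# Hypothetical streams, as a data-sensor layer might label them.
streams = [
    DataStream("billing.latency_ms",
               frozenset({"layer:application", "app:billing", "metric:latency"})),
    DataStream("billing.host1.cpu",
               frozenset({"layer:infrastructure", "app:billing", "metric:cpu"})),
    DataStream("orders.latency_ms",
               frozenset({"layer:application", "app:orders", "metric:latency"})),
]

# An SLO authored purely in terms of tags, with no reference to stream objects.
latency_slo = SLO("app-latency",
                  frozenset({"layer:application", "metric:latency"}),
                  threshold=500.0)

matched = [s.name for s in streams_for_slo(latency_slo, streams)]
by_app = group_by_tag_prefix(streams, "app:")
```

In this sketch a newly deployed component only needs its streams tagged at the sensor layer; an existing SLO picks them up through tag matching, and a tag-driven view can regroup streams by any tag prefix without changes to the SLO definitions.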

Full Text