Abstract

Monitoring is an important aspect of designing and maintaining large-scale systems. Cloud computing presents a unique set of challenges to monitoring, including on-demand infrastructure, unprecedented scalability, rapid elasticity and performance uncertainty. There is a wide range of monitoring tools originating from cluster and high-performance computing, grid computing and enterprise computing, as well as a series of newer bespoke tools which have been designed exclusively for cloud monitoring. These tools share a number of common elements and designs, which address the demands of cloud monitoring to various degrees. This paper performs an exhaustive survey of contemporary monitoring tools, from which we derive a taxonomy that examines how effectively existing tools and designs meet the challenges of cloud monitoring. We conclude by examining the socio-technical aspects of monitoring, and investigate the engineering challenges and practices behind implementing monitoring strategies for cloud computing.

Highlights

  • Monitoring large-scale distributed systems is challenging and plays a crucial role in virtually every aspect of a software-orientated organisation

  • With no physical infrastructure and a propensity for scale and change, it is critical that stakeholders employ a monitoring strategy which allows for the detection of problems, optimisation, cost forecasting, intrusion detection, auditing and other use cases

  • This paper has exhaustively detailed the wide range of monitoring tools related to cloud monitoring


Summary

Introduction

Monitoring large-scale distributed systems is challenging and plays a crucial role in virtually every aspect of a software-orientated organisation. A Nagios service check consists of obtaining the relevant data from the monitored host and checking that value against an expected value or range of values, raising an alert if an unexpected value is detected. This simple configuration does not scale well, as the single server becomes a significant bottleneck as the pool of monitored servers grows. Collectd [39] is an open source tool for collecting monitoring state which is highly extensible and supports all common applications, logs and output formats. It is used by many cloud providers as part of their own monitoring solutions, including Rightscale [40]. At scale, processing and storing these variables at this interval requires a significant volume of compute capacity.
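
The service check described in the summary above (obtain a value, compare it against an expected range, raise an alert when it falls outside) can be illustrated with a minimal sketch. This is not Nagios code. The function names `check_value` and `run_check`, the "higher is worse" threshold semantics and the example thresholds are all assumptions made for illustration. Only the four status codes follow the exit-code convention used by Nagios plugins.

```python
from typing import Callable, Optional, Tuple

# Status codes following the Nagios plugin exit-code convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_value(value: Optional[float], warn: float, crit: float) -> int:
    """Compare a metric against thresholds (assumes higher values are worse)."""
    if value is None:
        return UNKNOWN  # no data could be obtained from the host
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def run_check(
    fetch: Callable[[], float], warn: float, crit: float
) -> Tuple[int, Optional[float]]:
    """Obtain a value via `fetch` (hypothetical collector) and classify it."""
    try:
        value: Optional[float] = fetch()
    except Exception:
        value = None  # treat any collection failure as "unknown"
    return check_value(value, warn, crit), value


if __name__ == "__main__":
    # Hypothetical disk-usage reading of 91% with 80%/90% thresholds.
    status, value = run_check(lambda: 91.0, warn=80.0, crit=90.0)
    if status != OK:
        print(f"ALERT status={status} value={value}")
```

In a single-server setup, one process would call something like `run_check` for every monitored host on every interval, which is where the bottleneck noted in the summary comes from.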
