Abstract
The MonALISA (Monitoring Agents using a Large Integrated Services Architecture) framework provides a set of distributed services for monitoring, control, management and global optimization of large scale distributed systems. It is based on an ensemble of autonomous, multi-threaded, agent-based subsystems which are registered as dynamic services and can be automatically discovered and used by other services or clients. The distributed agents can collaborate and cooperate in performing a wide range of management, control and global optimization tasks (such as network monitoring and resource accounting) using real-time monitoring information. MonALISA includes a coherent set of network management services that collect, in near real time, information about the network topology, the main data flows, traffic volume and the quality of connectivity. A set of dedicated modules was developed in the MonALISA framework to periodically perform network measurement tests between all sites. We developed global services to present in near real time the entire network topology used by a community. The time evolution of the global network topology is shown in a dedicated GUI. Changes in the global topology at this level occur quite frequently, and even small modifications in the connectivity map may significantly affect network performance. The global topology graphs are correlated with active end-to-end network performance measurements, done using the Fast Data Transfer application, between all sites. Access to both real-time and historical data, as provided by MonALISA, is also important for developing services able to predict usage patterns and to aid in efficiently allocating resources globally. For resource accounting, MonALISA collects information regarding the amounts of resources consumed by the users, who represent virtual organizations in a large scale distributed system.
Besides providing statistical information, an accounting system can also be the basis for managing distributed resources according to an economic model. In the MonALISA monitoring framework we developed modules that provide accounting facilities, collecting information from cluster managers such as Condor, PBS, LSF and SGE. The usage statistics are used for intelligent management of the resources.
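The accounting idea described above, per-job usage collected from several cluster managers and summed per virtual organization, can be illustrated with a minimal sketch. It is not MonALISA's accounting module: the `JobRecord` fields, the `usage_by_vo` function and the sample values are all hypothetical, chosen only to show the aggregation step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One finished job as a batch system might report it (hypothetical schema)."""
    vo: str              # virtual organization the submitting user belongs to
    scheduler: str       # e.g. "Condor", "PBS", "LSF", "SGE"
    cpu_seconds: float
    wall_seconds: float


def usage_by_vo(records):
    """Sum job count, CPU time and wall-clock time per virtual organization."""
    totals = defaultdict(lambda: {"jobs": 0, "cpu_seconds": 0.0, "wall_seconds": 0.0})
    for record in records:
        entry = totals[record.vo]
        entry["jobs"] += 1
        entry["cpu_seconds"] += record.cpu_seconds
        entry["wall_seconds"] += record.wall_seconds
    return dict(totals)


# Illustrative records only; the numbers are made up for the example.
sample_records = [
    JobRecord("cms", "Condor", 100.0, 120.0),
    JobRecord("cms", "PBS", 50.0, 80.0),
    JobRecord("atlas", "LSF", 30.0, 40.0),
]
report = usage_by_vo(sample_records)
```

Because the records are normalized before aggregation, the scheduler that produced a record does not affect the per-organization totals; an economic model could then be applied to the totals rather than to scheduler-specific logs.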
Highlights
An important part of managing global-scale systems is a monitoring system that is able to monitor and track in real time many site facilities, networks, and tasks in progress
MonALISA, which stands for Monitoring Agents using a Large Integrated Services Architecture, is a monitoring framework designed as an ensemble of dynamic services, able to collaborate and cooperate in performing a wide range of information gathering and processing tasks
We present a set of services developed in the context of the MonALISA framework for monitoring and controlling large scale networks, as an extension of the work previously presented in [2]
Summary
An important part of managing global-scale systems is a monitoring system that is able to monitor and track in real time many site facilities, networks, and tasks in progress. The monitoring information gathered is essential for developing the required higher level services (the components that provide decision support and some degree of automated decisions) and for maintaining and optimizing workflow in large scale distributed systems (LSDS). These management and global optimization functions are performed by higher level agent-based services. The monitoring framework has to intelligently collect, in an LSDS environment, a large number of monitoring events that are generated by the system components during execution or during interaction with external objects (such as users or processes). Monitoring such events is necessary for observing the run-time behavior of the large scale distributed system and for providing the status information required for debugging, tuning and managing processes.
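As a rough illustration of what collecting monitoring events for both current status and later analysis can involve, the sketch below keeps a bounded history of events per (source, parameter) pair. It is an assumption-laden toy, not the MonALISA API: `MonitoringEvent`, `EventCollector`, the `history_size` parameter and the sample values are hypothetical names introduced for this example.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoringEvent:
    """A single measurement reported by a system component (hypothetical schema)."""
    source: str       # component that produced the event, e.g. a site or node name
    parameter: str    # what was measured, e.g. "load" or "traffic_in"
    timestamp: float  # seconds since some reference time
    value: float


class EventCollector:
    """Keeps the most recent events for each (source, parameter) series."""

    def __init__(self, history_size=1000):
        # A deque with maxlen silently drops the oldest event when full.
        self._series = defaultdict(lambda: deque(maxlen=history_size))

    def record(self, event):
        self._series[(event.source, event.parameter)].append(event)

    def latest(self, source, parameter):
        """Return the newest event of a series, or None if nothing was recorded."""
        series = self._series.get((source, parameter))
        return series[-1] if series else None

    def history(self, source, parameter, since=0.0):
        """Return retained events of a series with timestamp >= since."""
        series = self._series.get((source, parameter), ())
        return [event for event in series if event.timestamp >= since]


collector = EventCollector(history_size=3)
for t, load in enumerate([0.2, 0.5, 0.9, 0.4]):
    collector.record(MonitoringEvent("siteA", "load", float(t), load))
```

Bounding the history per series is one simple way to keep memory use predictable when many components report frequently; a real deployment would more likely persist older values to a database instead of discarding them.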