Artificial intelligence in the service of system administrators

C Haen, E Bonaccorsi, N Neufeld, V Barra

doi:10.1088/1742-6596/396/5/052038

Open Access

Journal: Journal of Physics: Conference Series
Publication Date: Dec 13, 2012
Citations: 2
License type: cc-iop-open

Affiliation: Institut Pascal

Abstract

The LHCb online system relies on a large and heterogeneous IT infrastructure made up of thousands of servers running many different applications. These applications perform a great variety of tasks, from critical ones such as data taking to secondary ones such as web servers. Administering such a system and making sure it works properly represents a very heavy workload for the small team of expert operators. Research into automating (some) system administration tasks began in 2001, when IBM defined the so-called “self objectives” intended to lead to “autonomic computing”. In this context, we present a framework that uses artificial intelligence and machine learning to monitor and diagnose Linux-based systems and their interaction with software, at a low level and in a non-intrusive way. Moreover, the multi-agent approach we use, coupled with an architecture based on an “object-oriented paradigm”, should considerably increase our learning speed and highlight relations between problems.

Full Text