Services provided by modern organizations are usually designed, deployed, and supported by large-scale IT infrastructures. To obtain the best performance from these services, organizations must enforce rational practices for managing the resources that compose their infrastructures. A common point in most guides and libraries of best practices for IT management, such as ITIL or COBIT, is an explicit concern with the risks related to IT activities. Proactively dealing with adverse and favorable events that may arise during everyday operations can prevent, for example, delays in service deployment, cost overruns in activities, predictable failures of handled resources, and, consequently, wasted money. Despite its importance, risk management in IT environments usually lacks automation and standardization in practice. In this article, we therefore introduce a framework to support the automation of some key steps of risk management. Our goal is to organize risk information related to IT activities so as to support decision making, thereby making risk response planning simpler, faster, and more accurate. The proposed framework targets workflow-based IT management systems. Its fundamental approach is to learn from problems reported in the history of previously conducted workflows in order to estimate risks for future executions. We evaluated the applicability of the framework in two case studies, both in IT-related areas: IT change management and IT project management. The results show that the framework not only speeds up the risk assessment process but also assists the decision making of project managers and IT operators by organizing detailed risk information comprehensively.