The virtualization concept and elasticity feature of cloud computing enable users to request resources on-demand and in the pay-as-you-go model. However, the high flexibility of the model makes the on-time resource scaling problem more complex. A variety of techniques such as threshold-based rules, time series analysis, or control theory are utilized to increase the efficiency of dynamic scaling of resources. However, the inherent dynamicity of cloud-hosted applications requires autonomic and adaptable systems that learn from the environment in real-time. Reinforcement Learning (RL) is a paradigm that requires some agents to monitor the surroundings and regularly perform an action based on the observed states. RL has a weakness to handle high dimensional state space problems. Deep-RL models are a recent breakthrough for modeling and learning in complex state space problems. In this article, we propose a Hybrid Anomaly-aware Deep Reinforcement Learning-based Resource Scaling (ADRL) for dynamic scaling of resources in the cloud. ADRL takes advantage of anomaly detection techniques to increase the stability of decision-makers by triggering actions in response to the identified anomalous states in the system. Two levels of global and local decision-makers are introduced to handle the required scaling actions. An extensive set of experiments for different types of anomaly problems shows that ADRL can significantly improve the quality of service with less number of actions and increased stability of the system.
Read full abstract