Abstract
Microservice architecture has emerged as a powerful paradigm for cloud computing due to its high efficiency in infrastructure management as well as its capability to serve users at large scale. A cloud provider requires flexible resource management, such as auto-scaling and provisioning, to meet continually changing demands. A common approach used in both commercial and open-source computing platforms is workload-based automatic scaling, which adds instances as the number of incoming requests increases. Concurrency-based scaling is a request-based policy recently proposed in the evolving microservice framework; under this policy, each instance processes up to a configured maximum number of requests in parallel, and the platform scales out when that limit is reached. However, it has proven difficult to identify the concurrency configuration that provides the best possible service quality, as throughput and latency depend on the workload and the complexity of the infrastructure characteristics. Therefore, this study investigated the applicability of an artificial intelligence approach to request-based auto-scaling in the microservice framework. Our results showed that the proposed model could learn an effective scaling policy within a limited number of pods, thereby outperforming the default auto-scaling configuration.
Highlights
With the advancement and spread of virtual machine (VM) and container technology, the adoption of microservice computing models has increased substantially [1]
Existing research shows that different workloads can affect microservice performance and cause latency differences of up to a few seconds. Since this can have a significant impact on user experience, we propose a reinforcement learning (RL)-based model that dynamically determines the optimal concurrency for individual workloads
The results demonstrated that throughput and latency degrade under overload, and showed that performance can be improved through adaptive concurrency configuration
Summary
With the advancement and spread of virtual machine (VM) and container technology, the adoption of microservice computing models has increased substantially [1]. Microservice computing benefits users chiefly because it is easy to scale on demand: idle VMs or containers can be deployed without difficulty. Some microservice frameworks use the resource-based Kubernetes Horizontal Pod Autoscaler (HPA) to drive expansion through CPU or memory utilization thresholds per instance. Available microservice platforms are often characterized by workload-based expansion, in which additional resources are provided as incoming traffic increases. However, reacting to new requests through the HPA in this way introduces scaling delays. To mitigate this problem, one may use an open-source framework that supports parallel processing of up to a predefined number of simultaneous requests per instance [4]. In our RL model, a fixed target network that remains close to the Q-network can lead to effective learning [37,38,39,40].
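The concurrency-based scaling described above can be illustrated with a minimal sketch: the number of desired instances is derived from the observed number of in-flight requests divided by a configured per-instance concurrency limit. The function name, parameter names, and the replica cap are illustrative assumptions, not the paper's actual implementation.

```python
import math

def desired_replicas(in_flight_requests: int,
                     target_concurrency: int,
                     max_replicas: int = 10) -> int:
    """Concurrency-based scaling sketch: one replica is requested for every
    `target_concurrency` requests currently being processed in parallel,
    capped at `max_replicas`. All names here are illustrative."""
    if in_flight_requests <= 0:
        return 1  # keep at least one replica available
    return min(max_replicas,
               math.ceil(in_flight_requests / target_concurrency))
```

For example, with a per-instance concurrency limit of 100, an observed load of 250 concurrent requests would yield 3 replicas, while a burst of 5000 requests would be capped at the configured maximum of 10. The RL model in this study would, in effect, tune the `target_concurrency` value per workload rather than leaving it fixed.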