Abstract
Modern network and computing infrastructures are tasked with meeting the stringent demands of today's applications. A pivotal concern is the minimization of the latency experienced by end-users accessing services. While emerging network architectures provide a conducive setting for orchestrating microservices in terms of reliability, self-healing, and resiliency, incorporating awareness of the latency perceived by the user into placement decisions remains an unresolved problem. Current research addresses the problem of minimizing inter-service latency without any guarantee on the latency from the end-user to the cluster. In this research, we introduce an architectural approach for scheduling service workloads within a given cluster, prioritizing placement on the node that offers the lowest perceived latency to the end-user. To validate the proposed approach, we implement it on Kubernetes (K8s), currently one of the most widely used workload orchestration platforms. Experimental results show that our approach effectively reduces the latency experienced by the end-user within a finite time without degrading the quality of service. We study the performance of the proposed approach by analyzing different parameters, with a particular focus on the size of the cluster and the number of replica pods involved in measuring the latency. We provide insights into possible trade-offs between computational cost and convergence time.
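To give a concrete sense of the placement criterion described above, the following is a minimal, self-contained Go sketch of the core idea: probe the round-trip latency toward the end-user from each candidate node's replica pod and prefer the node with the lowest measured value. The node names, probe addresses, and TCP-handshake-based RTT estimate are illustrative assumptions for this sketch, not the paper's actual implementation or the Kubernetes scheduler API.

```go
// Illustrative sketch (assumed, not the authors' code): rank candidate nodes
// by round-trip latency measured toward an endpoint standing in for the
// end-user, then select the lowest-latency node for placement.
package main

import (
	"fmt"
	"net"
	"sort"
	"time"
)

// nodeProbe pairs a candidate cluster node with an address that stands in
// for the end-user (or a measurement point close to the user).
type nodeProbe struct {
	Node    string
	Address string // host:port assumed reachable from that node's replica pod
}

// measureRTT estimates round-trip latency by timing a TCP handshake.
// A real deployment would aggregate several samples per replica pod.
func measureRTT(address string, timeout time.Duration) (time.Duration, error) {
	start := time.Now()
	conn, err := net.DialTimeout("tcp", address, timeout)
	if err != nil {
		return 0, err
	}
	defer conn.Close()
	return time.Since(start), nil
}

func main() {
	// Hypothetical candidate nodes and probe endpoints (RFC 5737 example addresses).
	probes := []nodeProbe{
		{Node: "node-a", Address: "192.0.2.10:443"},
		{Node: "node-b", Address: "192.0.2.20:443"},
		{Node: "node-c", Address: "192.0.2.30:443"},
	}

	type result struct {
		Node string
		RTT  time.Duration
	}
	var results []result
	for _, p := range probes {
		rtt, err := measureRTT(p.Address, 2*time.Second)
		if err != nil {
			fmt.Printf("skipping %s: %v\n", p.Node, err)
			continue
		}
		results = append(results, result{Node: p.Node, RTT: rtt})
	}

	// Prefer the node with the lowest perceived latency to the end-user.
	sort.Slice(results, func(i, j int) bool { return results[i].RTT < results[j].RTT })
	if len(results) > 0 {
		fmt.Printf("schedule pod on %s (RTT %v)\n", results[0].Node, results[0].RTT)
	}
}
```

In an actual Kubernetes integration, a criterion like this would be exposed to the scheduler (for example through its scoring extension points), and the trade-off between the number of replica pods used for measurement and the convergence time discussed in the abstract would govern how many such probes are run per scheduling decision.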