Abstract

As Artificial Intelligence (AI) becomes ubiquitous in many applications, serverless computing is emerging as a building block for developing cloud-based AI services. Serverless computing has received much interest because of its simplicity, scalability, and resource efficiency. However, as a trade-off for that resource efficiency, serverless computing suffers from the cold start problem, that is, a latency between the arrival of a request and the execution of the function. The cold start problem significantly affects the overall response time of a workflow composed of functions, because a cold start may occur at every function within the workflow. Function fusion is one way to mitigate the cold start latency of a workflow. If two functions are fused into a single function, the cold start of the second function is eliminated; however, if parallel functions are fused, the workflow response time can increase because the fused functions run sequentially, even though the cold start latency is reduced. This study presents an approach that mitigates the cold start latency of a workflow using function fusion while accounting for parallel execution. We identify three latencies that affect response time, present a workflow response time model that incorporates them, and efficiently find a fusion solution that optimizes the response time under cold starts. Across five workflows, our method achieves a response time of 28–86% of that of the original workflow.
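
To make the trade-off concrete, the following is a minimal sketch assuming a simple additive latency model; the cold start, dispatch, and execution figures are illustrative assumptions, not measurements from the paper. It contrasts fusing a two-function chain, where fusion always removes a cold start, with fusing a two-function fan-out, where fusion removes a cold start but serializes the branches.

    # A minimal sketch of the fuse-or-not trade-off, not the paper's exact
    # model. All latency figures below are illustrative assumptions.
    COLD = 1.0           # assumed cold start latency per function (s)
    INV = 0.1            # assumed per-invocation dispatch latency (s)
    e_a, e_b = 0.5, 0.8  # assumed execution times of functions a and b (s)

    # Sequential chain a -> b: fusion removes b's cold start and dispatch.
    chain_unfused = (INV + COLD + e_a) + (INV + COLD + e_b)
    chain_fused = INV + COLD + e_a + e_b

    # Fan-out a || b: branches start concurrently, each paying a cold start;
    # fusion pays one cold start but runs the branches sequentially.
    fan_unfused = max(INV + COLD + e_a, INV + COLD + e_b)
    fan_fused = INV + COLD + e_a + e_b

    print(f"chain:   unfused {chain_unfused:.1f}s, fused {chain_fused:.1f}s")
    print(f"fan-out: unfused {fan_unfused:.1f}s, fused {fan_fused:.1f}s")
    # With these numbers, fusing the chain helps (3.5s -> 2.4s) while
    # fusing the fan-out hurts (1.9s -> 2.4s), matching the caveat above.

A fusion decision can then simply keep whichever variant has the smaller modeled response time for each chain or fan-out in the workflow.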

Highlights

  • Artificial Intelligence (AI) has become a vital part of our everyday life

  • According to Daw et al. (2020) [8], cold start latency can account for 46% of the workflow response time when the workflow consists of 5 s functions, and up to 90% when it consists of 500 ms functions (a back-of-envelope check of these figures follows this list)

  • We present the impact of the cold start problem on a serverless workflow
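
As a back-of-envelope check of the cited figures: if every function in a chain pays one cold start, the cold start share of the response time is c / (d + c) for per-function execution time d and cold start latency c. The sketch below back-solves with an assumed c of roughly 4.5 s; that value is our illustrative choice and not a number reported in the paper.

    # Hedged sanity check of the cited percentages; c is an assumed value
    # chosen to reproduce them, not a measurement from Daw et al. [8].
    c = 4.5  # assumed cold start latency per function (s)
    for d in (5.0, 0.5):  # per-function execution times from the highlight
        share = c / (d + c)
        print(f"{d:g} s functions -> cold start share ~ {share:.0%}")
    # 5 s functions   -> ~47% (cited: 46%)
    # 0.5 s functions -> 90%  (cited: up to 90%)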

Summary

Introduction

Artificial Intelligence (AI) has become a vital part of our everyday life. Long before AlphaGo [1] surprised us by beating the top-class Go player Se-dol Lee in 2016, AI had already been applied to improve search results [2], enhance the recommendation quality of online shopping [3], and build human-like robots [4], among other things. When serverless computing is used on a public cloud rather than an on-premises platform, a function is subject to a maximum execution time limit. One early but notable study on reducing the cold start latency of a workflow is that of Daw et al. [8], in which the authors proposed a speculative function preloading method called Xanadu. In Xanadu, a preloading schedule (i.e., when and which function should be loaded) is computed from profile data containing function execution times, warm start latencies, and branch probabilities. To address the branching and parallel execution issues of workflow execution in a serverless environment, we instead propose a scheme that reduces the cold start latency of a workflow containing branches and potential parallelism through a function fusion technique, focusing on the correct and efficient execution of a given workflow.
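
At its core, function fusion replaces a cross-function invocation with an in-process call. Below is a minimal sketch assuming AWS Lambda-style Python handlers; the handler names and toy logic are hypothetical, and the paper's "Writing a Fused Function" section describes the actual rewriting procedure.

    # Minimal sketch of function fusion for a two-function chain, assuming
    # AWS Lambda-style Python handlers. Names and logic are hypothetical.

    def preprocess_handler(event, context=None):
        # First function in the chain: normalize the input text.
        return {"text": event["text"].strip().lower()}

    def classify_handler(event, context=None):
        # Second function: a placeholder classification step.
        return {"label": "positive" if "good" in event["text"] else "negative"}

    def fused_handler(event, context=None):
        # Fused function: the second stage runs in-process, so its cold
        # start and its invocation round trip are eliminated.
        return classify_handler(preprocess_handler(event), context)

    if __name__ == "__main__":
        print(fused_handler({"text": "  This movie is GOOD  "}))  # {'label': 'positive'}

Note that fusing in this way serializes the stages, which is harmless for a chain but, as discussed above, can lengthen the response time of a fan-out.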

Cold Start and Warm Start
Problem Modeling
Graph Representation of a Workflow
Function
Fan-Out
Conditional Branch
Latency
Workflow Response Time Model
Fusion Decision on a Function
Fusion Decision on a Fan-Out
Fusion Decision on a Conditional Branch
The Overall Process
Writing a Fused Function
Example Workflows
Matrix Multiplication
Related Works
Reducing the Cost of Serverless Computing by Function Fusion
Workflow Management and Optimization for Serverless Computing
Findings
Conclusions