Abstract

As Artificial Intelligence (AI) becomes ubiquitous in many applications, serverless computing is emerging as a building block for developing cloud-based AI services. Serverless computing has received much interest because of its simplicity, scalability, and resource efficiency. However, as a trade-off for that resource efficiency, serverless computing suffers from the cold start problem, that is, a latency between the arrival of a request and the execution of the function. The cold start problem significantly affects the overall response time of a workflow composed of functions, because a cold start may occur at every function within the workflow. Function fusion is one way to mitigate the cold start latency of a workflow. If two functions are fused into a single function, the cold start of the second function is eliminated; however, if parallel functions are fused, the workflow response time can increase because the fused functions run sequentially, even though the cold start latency is reduced. This study presents an approach that mitigates the cold start latency of a workflow using function fusion while accounting for parallel execution. We identify three latencies that affect response time, present a workflow response time model that incorporates them, and efficiently find a fusion solution that optimizes the response time under cold starts. Across five workflows, our method achieves a response time of 28–86% of that of the original workflow.
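
To make the trade-off concrete, the following is a minimal sketch assuming a simple additive latency model; the cold start, dispatch, and execution figures are illustrative assumptions, not measurements from the paper. It contrasts fusing a two-function chain, where fusion always removes a cold start, with fusing a two-function fan-out, where fusion removes a cold start but serializes the branches.

    # A minimal sketch of the fuse-or-not trade-off, not the paper's exact
    # model. All latency figures below are illustrative assumptions.
    COLD = 1.0           # assumed cold start latency per function (s)
    INV = 0.1            # assumed per-invocation dispatch latency (s)
    e_a, e_b = 0.5, 0.8  # assumed execution times of functions a and b (s)

    # Sequential chain a -> b: fusion removes b's cold start and dispatch.
    chain_unfused = (INV + COLD + e_a) + (INV + COLD + e_b)
    chain_fused = INV + COLD + e_a + e_b

    # Fan-out a || b: branches start concurrently, each paying a cold start;
    # fusion pays one cold start but runs the branches sequentially.
    fan_unfused = max(INV + COLD + e_a, INV + COLD + e_b)
    fan_fused = INV + COLD + e_a + e_b

    print(f"chain:   unfused {chain_unfused:.1f}s, fused {chain_fused:.1f}s")
    print(f"fan-out: unfused {fan_unfused:.1f}s, fused {fan_fused:.1f}s")
    # With these numbers, fusing the chain helps (3.5s -> 2.4s) while
    # fusing the fan-out hurts (1.9s -> 2.4s), matching the caveat above.

A fusion decision can then simply keep whichever variant has the smaller modeled response time for each chain or fan-out in the workflow.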

Highlights

  • Artificial Intelligence (AI) has become a vital part of our everyday life

  • According to Daw et al. (2020) [8], cold start latency can account for 46% of the workflow response time when the workflow consists of 5 s functions, and up to 90% when it consists of 500 ms functions (a back-of-envelope check of these figures follows this list)

  • We present the impact of the cold start problem on a serverless workflow
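
As a back-of-envelope check of the cited figures: if every function in a chain pays one cold start, the cold start share of the response time is c / (d + c) for per-function execution time d and cold start latency c. The sketch below back-solves with an assumed c of roughly 4.5 s; that value is our illustrative choice and not a number reported in the paper.

    # Hedged sanity check of the cited percentages; c is an assumed value
    # chosen to reproduce them, not a measurement from Daw et al. [8].
    c = 4.5  # assumed cold start latency per function (s)
    for d in (5.0, 0.5):  # per-function execution times from the highlight
        share = c / (d + c)
        print(f"{d:g} s functions -> cold start share ~ {share:.0%}")
    # 5 s functions   -> ~47% (cited: 46%)
    # 0.5 s functions -> 90%  (cited: up to 90%)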

Summary

Introduction

Artificial Intelligence (AI) has become a vital part of our everyday life. Long before AlphaGo [1] surprised us by beating the top-class Go player Se-dol Lee in 2016, AI had already been applied to improve search results [2], enhance the recommendation quality of online shopping [3], and build human-like robots [4], among other things. When serverless computing is used on a public cloud rather than an on-premises platform, a function is subject to a maximum execution time limit. One early but notable study on reducing the cold start latency of a workflow is that of Daw et al. [8], in which the authors proposed a speculative function preloading method called Xanadu. In Xanadu, a preloading schedule (i.e., when and which function should be loaded) is computed from profile data containing function execution times, warm start latencies, and branch probabilities. To address the branching and parallel execution issues of workflow execution in a serverless environment, we instead propose a scheme that reduces the cold start latency of a workflow containing branches and potential parallelism through a function fusion technique, focusing on the correct and efficient execution of a given workflow.
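
At its core, function fusion replaces a cross-function invocation with an in-process call. Below is a minimal sketch assuming AWS Lambda-style Python handlers; the handler names and toy logic are hypothetical, and the paper's "Writing a Fused Function" section describes the actual rewriting procedure.

    # Minimal sketch of function fusion for a two-function chain, assuming
    # AWS Lambda-style Python handlers. Names and logic are hypothetical.

    def preprocess_handler(event, context=None):
        # First function in the chain: normalize the input text.
        return {"text": event["text"].strip().lower()}

    def classify_handler(event, context=None):
        # Second function: a placeholder classification step.
        return {"label": "positive" if "good" in event["text"] else "negative"}

    def fused_handler(event, context=None):
        # Fused function: the second stage runs in-process, so its cold
        # start and its invocation round trip are eliminated.
        return classify_handler(preprocess_handler(event), context)

    if __name__ == "__main__":
        print(fused_handler({"text": "  This movie is GOOD  "}))  # {'label': 'positive'}

Note that fusing in this way serializes the stages, which is harmless for a chain but, as discussed above, can lengthen the response time of a fan-out.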

Cold Start and Warm Start
Problem Modeling
Graph Representation of a Workflow
Function
Fan-Out
Conditional Branch
Latency
Workflow Response Time Model
Fusion Decision on a Function
Fusion Decision on a Fan-Out
Fusion Decision on a Conditional Branch
The Overall Process
Writing a Fused Function
Example Workflows
Matrix Multiplication
Related Works
Reducing the Cost of Serverless Computing by Function Fusion
Workflow Management and Optimization for Serverless Computing
Findings
Conclusions