Abstract

Function-as-a-Service (FaaS) is a popular programming model for building serverless applications, supported by all major cloud providers and many open-source software frameworks. One of the main challenges for FaaS providers is providing fault tolerance for the deployed applications, that is, providing the ability to mask failures of function invocations from clients. The basic fault tolerance approach in current FaaS platforms is automatically retrying function invocations. Although the retry approach is well suited for transient failures, it incurs delays in recovering from other types of failures, such as node crashes. This paper proposes the integration of a Request Replication mechanism in FaaS platforms and describes how this integration was implemented in Fission, a well-known, open-source platform. It provides a detailed experimental comparison of the proposed approach with the retry approach and an Active-Standby approach in terms of performance, availability, and resource consumption under different failure scenarios.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call