Abstract
Despite significant advances in data management systems in recent decades, the processing of big data at scale remains very challenging. While cloud computing has been well-accepted as a solution to address scalability needs, cloud configuration and operation complexity persist and often present themselves as entry barriers, especially for novice data analysts. Serverless computing and Function-as-a-Service (FaaS) platforms have been suggested to reduce such entry barriers by shifting configuration and operational responsibilities from the application developer to the FaaS platform provider. Naturally, “serverless data processing (SDP)”, that is, using FaaS for (big) data processing, has received increasing interest in recent years. However, FaaS platforms were never primarily intended to support large data processing tasks. SDP, therefore, manifests itself through workarounds and adaptations on the application level, addressing various quirks and limitations of the FaaS platforms in use for data processing needs. This, in turn, creates tensions between the platforms and the applications using them, again encouraging the constant (re-)design of both. Consequently, we present lessons learned from a series of application and platform re-designs that address these tensions, leading to the development of an SDP reference architecture and a platform instantiation and implementation thereof called CREW. Mitigating the tensions through the process of application-platform co-design proves to reduce both entry barriers and costs significantly. In some experiments, CREW outperforms traditional, non-SDP big data processing frameworks by factors.