Context: A Systems Development Life Cycle (SDLC) is a model of phases, activities, roles, and products used systematically to develop software with the expected functional quality. Although SDLCs are widely applied to various types of software, their use remains unusual in Big Data Analytics Systems (BDAS). Objective: To address this issue, several SDLCs for BDAS have been proposed, along with comparative studies, to guide interested organizations in adapting them. This research seeks an SDLC that is lightweight, balanced, and feasible for small development teams or organizations, taking advantage of favorable characteristics of the international ISO standard. Method and Materials: This study describes the knowledge gap by reporting a comparative analysis of four relevant SDLCs (CRISP-DM, TDSP, BDPL, and DDSL). A selective research method was applied, focusing on alignment with the recent ISO/IEC 29110 Basic profile standard. The goal was to identify which SDLC contributes the most and fits best under a lightweight approach. Results: Among the rigorous approaches, the Cross Industry Standard Process for Data Mining (CRISP-DM) showed the highest alignment with the standard; among the agile approaches, it was the Domino Data Science Lifecycle (DDSL), which was the closest of the four. The Team Data Science Process (TDSP) stood out as the most agile of those analyzed but fell short of the required results. The Big Data Project Lifecycle (BDPL), which follows another standard, was too rigorous and more distant. Conclusions: Research on new SDLCs for BDAS has been practically nonexistent in software engineering from 2000 to 2023. Only BDPL was found in the academic literature, while the other three came from gray literature. Despite the relevance of this topic for BDAS organizations, no adequate SDLC was identified.