Abstract

Continuous integration (CI) has become indispensable in the software development process. A central promise of adopting CI is that new features or bug fixes can be delivered more quickly. A recent repository mining study by Bernardo, da Costa & Kulesza (2018) found that only about half of the investigated open source projects actually deliver pull requests (PRs) faster after adopting CI, with small effect sizes. However, there are concerns regarding the methodology used by Bernardo et al. that may limit the trustworthiness of this finding. In particular, they do not explicitly control for normal changes in PR delivery time during a project's lifetime (independently of CI introduction). Hence, in our work, we conduct a conceptual replication of this study. In a first step, we replicate their study results using the same subjects and methodology. In a second step, we address the same core research question using an adapted methodology. We use a different statistical method (regression discontinuity design, RDD) that is more robust to the confounding factor of projects naturally getting faster at delivering PRs over time, and we introduce a control group of comparable projects that never applied CI. Finally, we also evaluate the generalizability of the original findings on a set of new open source projects sampled using the same methodology. We find that the results of the study by Bernardo et al. largely hold in our replication. Using RDD, we do not find robust evidence of projects getting faster at delivering PRs without CI, and we similarly do not see a speed-up in our control group that never introduced CI. Further, results obtained from a newly mined set of projects are comparable to the original findings. In conclusion, we consider the replication successful.

Highlights

  • Continuous Integration (CI) is a popular practice in the software community (Duvall, Matyas & Glover, 2007)

  • To extend the original study methodology, and to address the concerns we have with the experimental methodology as initially proposed, we investigate two different aspects: RQ2.1: Can similar results to the original study be found when controlling for changes in pull request (PR) delivery time over the lifetime of a project? To answer this question, we apply Regression Discontinuity Design (RDD; Thistlethwaite & Campbell, 1960), a statistical method that allows us to evaluate whether there is a trend in PR delivery times over time, and whether this trend changes significantly when CI is introduced

  • New data: To collect a new data set, we largely follow the process originally used by Bernardo, da Costa & Kulesza (2018), which in turn was inspired by Vasilescu et al. (2015)
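To make the regression discontinuity idea from the highlights concrete, the following is a minimal sketch and not the authors' implementation. It fits a segmented linear model y = b0 + b1·(t − c) + b2·D + b3·(t − c)·D, where D is 1 from the cutoff c (the CI introduction) onward, so that b1 captures the pre-existing trend, b2 the jump at CI introduction, and b3 the change in trend afterwards. The function names (`solve_linear_system`, `fit_rdd`), the monthly time scale, and the data are all hypothetical and synthetic; the model actually used in the study may include further covariates and a different estimation procedure.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_rdd(times, values, cutoff):
    """Ordinary least-squares fit of a segmented linear model around `cutoff`.

    Hypothetical helper: returns the baseline level at the cutoff, the
    pre-cutoff trend, the jump at the cutoff, and the post-cutoff trend change.
    """
    rows = []
    for t in times:
        d = 1.0 if t >= cutoff else 0.0  # 1 once CI has been introduced
        rows.append([1.0, t - cutoff, d, (t - cutoff) * d])
    p = 4
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    b0, b1, b2, b3 = solve_linear_system(xtx, xty)
    return {"intercept": b0, "trend": b1, "jump": b2, "trend_change": b3}


# Synthetic, noise-free example: months relative to CI introduction (t = 0).
# Delivery time already declines by 0.1 per month before CI; at CI introduction
# it drops by 2 and the slope changes by +0.05. These numbers are made up.
times = list(range(-12, 12))
values = [10 - 0.1 * t + ((-2 + 0.05 * t) if t >= 0 else 0.0) for t in times]

fit = fit_rdd(times, values, cutoff=0)
for name, estimate in fit.items():
    print(f"{name}: {estimate:.3f}")
```

Separating `trend` from `jump` is the point of the design: a project that was already getting faster before CI shows up as a non-zero `trend`, and only the discontinuity and slope change at the cutoff are attributed to CI introduction. A real analysis would also need significance tests on these coefficients, which this sketch omits.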


Summary

Introduction

Continuous Integration (CI) is a popular practice in the software community (Duvall, Matyas & Glover, 2007). CI helps developers integrate changes frequently in a collaborative manner. As a distributed and cooperative practice, CI is commonly used in both commercial and open source software (OSS) development. Ståhl & Bosch (2014) claim that integrators tend to release more frequently after adopting CI. We present important background on CI and the pull request based development model. CI promises manifold benefits, such as speeding up the delivery of new functionality (Laukkanen, Paasivaara & Arvonen, 2015), reducing problems of code integration in a collaborative environment (Vasilescu et al., 2014), and guaranteeing the stability of the code in the mainline. CI has found widespread practitioner adoption (Hilton et al., 2016), making it a relevant subject of academic study.
