Abstract

This paper studied how different types of changes are distributed across a software system, and how artifact (file and module) size relates to those changes. We used change data from the open source Eclipse project across its decade-long evolution history. The latest release has 220 modules, 33,904 files, 3,780,201 lines of code, and 49,853 cumulative changes. The study focused on two levels of software artifacts, module and file, and four contexts of changes: all changes, error changes, non-error changes, and 19 change categories. We found that the power-law distribution was common to three contexts of changes at both the module and file levels: it existed in all changes, in error changes, and in non-error changes. When we analyzed the 19 change categories, files and modules behaved differently: the power-law distribution existed in all but one category at the module level, but about two thirds of the change categories did not show it at the file level. On the relationship between artifact size and changes, we found that, at the module level, the few modules that had the majority of changes also accounted for the majority of the code size; however, this phenomenon disappeared when we separated error changes from non-error changes. At the file level, this phenomenon did not exist at all. We did not find any correlation between artifact size and changes at either the module or file level.
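To make the three quantities in the abstract concrete, the following is a minimal sketch of how they could be computed from per-artifact change counts and sizes. It is an illustration only, not the paper's method, which the abstract does not specify. The function names (`loglog_slope`, `top_share`, `pearson`) and the sample data are hypothetical. A roughly straight line on a log-log rank plot is only an informal hint of power-law behavior, not a rigorous test.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    Assumes at least two positive counts. A slope that is roughly
    constant across ranks is an informal sign of a power law.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def top_share(counts, fraction=0.2):
    """Share of all changes held by the top `fraction` of artifacts."""
    ranked = sorted(counts, reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


def pearson(xs, ys):
    """Pearson correlation, e.g. artifact size against change count.

    Assumes both sequences have the same length and nonzero variance.
    """
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: change counts that follow count = 1000 / rank.
synthetic_changes = [1000 / rank for rank in range(1, 51)]
slope = loglog_slope(synthetic_changes)
```

With real data, `counts` would be the number of changes per module or per file, computed separately for each change context (all, error, non-error, or one of the 19 categories), and `pearson` would take artifact sizes and change counts in matching order.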
