Abstract

A readme file plays an important role in a GitHub repository: it gives developers a starting point for reusing the repository and contributing to it. A good readme can provide enough information for users to learn about and get started with a repository, and it may be correlated with the repository's popularity. Given this role, we study the correlation between the readme files of GitHub repositories and the popularity of those repositories. We analyze the readme files of 5,000 GitHub repositories across more than 20 languages and examine the relationship between readme-related factors and repository popularity. We observe that: (1) Most of the studied readme-related factors (e.g., the number of lists, and the number and frequency of updates to the readme file) differ statistically significantly between popular and non-popular repositories, with non-negligible effect sizes. (2) After controlling for repository-specific factors (e.g., repository topics and license information), the number of lists and the frequency of updates are the most important factors that discriminate between popular and non-popular repositories. (3) In popular repositories, most updates are made to update references, whereas in non-popular repositories most updates change the content that explains how to use the repository.

Editor’s note: Open Science material was validated by the Journal of Systems and Software Open Science Board.

