In May 2019, GitHub launched its sponsor mechanism, signaling a move toward deeper integration of open source development and economic support. The mechanism is expected to bring more comprehensive and diversified support to the open source community. However, the number of developers who profit from it follows a long-tail distribution. Our study found that only 31% of developers who enabled the sponsor mechanism received rewards, and 39.3% of those received only one dollar. Our work focuses on identifying the factors that affect whether developers in the open source community obtain sponsorship. We start by defining 45 features that characterize developers along four dimensions: Personality, Advertisement, Repository, and Behavior. Statistical analysis indicates that most of the proposed features differ significantly between developers who received rewards (MTs_Yes for short) and those who did not. We then build a machine learning model based on the proposed features to predict MTs_Yes. Compared with existing work, our method outperforms the baselines by 30% in AUC (Area Under the Curve). In addition, we investigate the relative contribution of the features in detecting MTs_Yes and analyze the most important ones using SHAP, a model interpretation method. Finally, based on the experimental results, we offer practical suggestions for developers who want to receive rewards, so that the open source community can develop more harmoniously.
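As a reminder of what the AUC metric reported above measures, the following is a minimal sketch of AUC computed as the probability that a randomly chosen positive example (here, a developer labeled MTs_Yes) is scored higher than a randomly chosen negative one. The function name `auc_score` and the toy labels and scores are hypothetical and chosen only for illustration; they are not the data or the implementation used in this work.

```python
def auc_score(labels, scores):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive is scored higher; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = received rewards (MTs_Yes), 0 = did not.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.8]
print(auc_score(labels, scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the metric suits imbalanced settings where most developers receive no rewards.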