Assessing team software development projects is notoriously difficult, and such assessments typically rely on subjective metrics. To help make assessments more rigorous, we conducted an empirical study to explore relationships between subjective metrics based on peer and instructor assessments, and objective metrics based on GitHub and chat data. We studied 23 undergraduate software teams (n = 117 students) from two undergraduate computing courses at two North American research universities. We collected data on teams’ (a) commits and issues from their GitHub code repositories, (b) chat messages from their Slack and Microsoft Teams channels, (c) peer evaluation ratings from the CATME peer evaluation system, and (d) individual assignment grades from the courses. We derived metrics from (a) and (b) to measure both individual team members’ contributions to the team and the equality of team members’ contributions. We then performed Pearson correlation analyses to identify correlations among the metrics, peer evaluation ratings, and individual grades. We found significant positive correlations between team members’ GitHub contributions, chat contributions, and peer evaluation ratings. In addition, the equality of teams’ GitHub contributions was positively correlated with teams’ average peer evaluation ratings and negatively correlated with the variance in those ratings. However, no such positive correlations were detected between the equality of teams’ chat contributions and their peer evaluation ratings. Our study extends previous research results by providing evidence that (a) team members’ chat contributions, like their GitHub contributions, are positively correlated with their peer evaluation ratings; (b) team members’ chat contributions are positively correlated with their GitHub contributions; and (c) the equality of teams’ GitHub contributions is positively correlated with their peer evaluation ratings.
These results lend further support to the idea that combining objective and subjective metrics can make the assessment of team software projects more comprehensive and rigorous.
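
As a rough illustration of the kind of analysis described above, the following sketch computes each member's share of a team's contributions, an equality score for the team, and a Pearson correlation coefficient. The abstract does not say how contribution equality was measured, so the use of one minus the Gini coefficient here is an assumption, and the function names (`contribution_shares`, `equality_index`, `pearson_r`) are hypothetical rather than taken from the study.

```python
from math import sqrt


def contribution_shares(counts):
    """Return each member's fraction of the team's total contributions."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def equality_index(counts):
    """Assumed equality measure: 1 minus the Gini coefficient.

    1.0 means every member contributed the same amount; values closer
    to 0 mean contributions were concentrated in fewer members.
    """
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        return 1.0
    # Gini coefficient via the mean absolute difference over all pairs.
    abs_diff_sum = sum(abs(a - b) for a in counts for b in counts)
    gini = abs_diff_sum / (2 * n * total)
    return 1.0 - gini


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Made-up commit counts for one four-person team (not study data).
    commits = [40, 25, 20, 15]
    print(contribution_shares(commits))
    print(equality_index(commits))
```

In a study like this one, `pearson_r` would be applied twice: across students, to relate individual contribution shares to peer evaluation ratings, and across teams, to relate each team's equality score to the mean or variance of its ratings. Testing whether a correlation is significant requires an additional step that this sketch omits.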