Predicting health indicators for open source projects (using hyperparameter optimization)

Tianpei Xia, Rui Shu, Tim Menzies, Wei Fu, Rishabh Agrawal

doi:10.1007/s10664-022-10171-0

Abstract

Software developed on public platforms is a source of data that can be used to make predictions about those projects. While individual development activity may be random and hard to predict, development behavior at the project level can be predicted with good accuracy when large groups of developers work together on software projects. To demonstrate this, we use 64,181 months of data from 1,159 GitHub projects to make various predictions about the recent status of those projects (as of April 2020). We find that traditional estimation algorithms make many mistakes: k-nearest neighbors (KNN), support vector regression (SVR), random forest (RFT), linear regression (LNR), and regression trees (CART) all have high error rates. That error rate can, however, be greatly reduced using hyperparameter optimization. To the best of our knowledge, this is the largest study yet conducted that uses recent data to predict multiple health indicators of open-source projects. To facilitate open science (and replications and extensions of this work), all our materials are available online at https://github.com/arennax/Health_Indicator_Prediction.
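As a rough illustration of the idea in the abstract (an untuned learner versus the same learner after hyperparameter optimization), the sketch below tunes the single parameter k of a hand-written k-nearest-neighbors regressor. Everything in it is an assumption made for illustration: the abstract does not say which optimizer or settings the authors used, so a plain grid search stands in for it, and the data is synthetic rather than taken from the GitHub projects studied. The names `knn_predict`, `mean_abs_error`, and `tune_k` are hypothetical helpers.

```python
import random

def knn_predict(train, x, k):
    """Predict y for x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_abs_error(train, valid, k):
    """Mean absolute error of k-NN on the validation rows."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in valid) / len(valid)

def tune_k(train, valid, candidates):
    """Grid search: return (best_k, best_error) over the candidate k values."""
    scored = [(mean_abs_error(train, valid, k), k) for k in candidates]
    best_error, best_k = min(scored)
    return best_k, best_error

# Synthetic stand-in data (NOT from the paper): a noisy monthly trend.
rng = random.Random(0)
rows = [(month, 0.5 * month + rng.gauss(0, 5)) for month in range(120)]
rng.shuffle(rows)
train, valid = rows[:80], rows[80:]

default_k = 1  # hypothetical "untuned" setting, chosen for illustration
default_error = mean_abs_error(train, valid, default_k)
best_k, best_error = tune_k(train, valid, candidates=range(1, 21))

print(f"default k={default_k}: MAE={default_error:.2f}")
print(f"tuned   k={best_k}: MAE={best_error:.2f}")
```

Because the default value is among the candidates, the tuned error can never exceed the default error on the validation rows. A fair comparison would also score the chosen setting on a separate held-out test set, since the validation rows were used to pick k.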


Journal: Empirical Software Engineering
Publication Date: Jun 22, 2022
Citations: 8

