Abstract

Open-source software (OSS) plays an increasingly significant role in modern software development tendency, so accurate prediction of the future development of OSS has become an essential topic. The behavioral data of different open-source software are closely related to their development prospects. However, most of these behavioral data are typical high-dimensional time series data streams with noise and missing values. Hence, accurate prediction on such cluttered data requires the model to be highly scalable, which is not a property of traditional time series prediction models. To this end, we propose a temporal autoregressive matrix factorization (TAMF) framework that supports data-driven temporal learning and prediction. Specifically, we first construct a trend and period autoregressive model to extract trend and period features from OSS behavioral data, and then combine the regression model with a graph-based matrix factorization (MF) to complete the missing values by exploiting the correlations among the time series data. Finally, use the trained regression model to make predictions on the target data. This scheme ensures that TAMF can be applied to different types of high-dimensional time series data and thus has high versatility. We selected ten real developer behavior data from GitHub for case analysis. The experimental results show that TAMF has good scalability and prediction accuracy.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call