Abstract

Predicting the success of a business venture has always been a struggle for both practitioners and researchers. However, thanks to companies that aggregate data about other firms, it has become possible to create and validate predictive models on an unprecedented number of real-world examples. In this study, we use data obtained from Crunchbase, one of the largest platforms integrating business information. Our final training set consisted of 213 171 companies. This work aims to create a machine learning model that forecasts a company's success. Many similar attempts have been made in recent years. Plenty of those experiments, often conducted with data gathered from several different sources, reported promising results. However, we found that they were very often significantly biased because their data contained information that was a direct consequence of a company reaching some level of success (or failure). Such an approach is a classic example of look-ahead bias. It leads to very optimistic test results, but using such a model in a real-world scenario can have serious consequences. We designed our experiments to prevent any information unavailable at the decision moment from leaking into the training set. We compared three algorithms: logistic regression, support vector machine, and the gradient boosting classifier. Despite the conscious decision to limit the number of predictors, we reached very promising results in terms of precision, recall, and F1 score, which for the best model were 57%, 34%, and 43%, respectively. The best outcomes were obtained with the gradient boosting classifier. We give detailed information about the importance of different features; the top three are the country and the region that the company operates in and the company's industry.
Our model can be applied directly as a decision support system for different types of venture capital funds.
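
As a quick consistency check on the reported metrics, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two figures given above. The sketch below is illustrative only: the function and variable names are our own and are not taken from the study's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Convention: F1 is defined as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract for the best model.
reported_precision = 0.57
reported_recall = 0.34

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2%}")  # about 42.6%, which rounds to the reported 43%
```

Because the harmonic mean is pulled toward the smaller of the two values, the F1 score sits closer to the 34% recall than to the 57% precision.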
