Investing successfully in early-stage companies involves high uncertainty. In particular, investors need more robust tools to reduce the risk and manage the uncertainty associated with startups. This research aims to use machine learning methods to design a prediction solution that identifies successful startups for investors. To design the prediction solution and provide policies, classification and clustering algorithms were used to predict the success of startups, and feature importance analysis was performed using the SHAP and permutation methods. The performance of the classification algorithms Random Forest, Gradient Boosting, Multilayer Perceptron, Logistic Regression, and Support Vector Machine was compared for predicting business success. Random Forest and Gradient Boosting showed the best accuracy, at 82% and 80%, respectively. The SHAP-based feature importance analysis for Random Forest and Gradient Boosting indicated that higher values of “Number of followers on LinkedIn”, “Number of employees on LinkedIn”, “Number of followers on Twitter”, and “Last raised amount” have higher SHAP values and a more significant impact on the model output. Three clustering algorithms (hierarchical clustering, K-means, and DBSCAN) were also compared. Among them, K-means performed best, with a silhouette score of 72%, and was used to explain each cluster’s characteristics. Finally, an effective artificial intelligence-based prediction solution is proposed to show investors how to apply machine learning concepts to predict the success of startups.
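The workflow described above (compare classifiers, rank features, then score a clustering) could be sketched roughly as follows with scikit-learn. This is only an illustration under stated assumptions: the data is synthetic, the feature names are hypothetical stand-ins for the features named in the abstract, only three of the classifiers are included, and permutation importance is shown without SHAP. It is not the authors' pipeline and will not reproduce the reported figures.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical feature names; the real study's dataset is not reproduced here.
FEATURES = [
    "linkedin_followers",
    "linkedin_employees",
    "twitter_followers",
    "last_raised_amount",
]


def make_data(seed=0):
    """Build a synthetic binary 'success' dataset and split it (assumption)."""
    X, y = make_classification(
        n_samples=400, n_features=4, n_informative=3, n_redundant=1,
        random_state=seed,
    )
    return train_test_split(X, y, test_size=0.25, random_state=seed, stratify=y)


def compare_classifiers(X_train, X_test, y_train, y_test, seed=0):
    """Fit several classifiers and return them with their test accuracy."""
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy
    return models, scores


def rank_features(model, X_test, y_test, seed=0):
    """Rank features by mean permutation importance, highest first."""
    result = permutation_importance(
        model, X_test, y_test, n_repeats=10, random_state=seed
    )
    order = result.importances_mean.argsort()[::-1]
    return [(FEATURES[i], float(result.importances_mean[i])) for i in order]


def cluster_quality(X, k=3, seed=0):
    """Cluster standardized features with K-means and return the silhouette score."""
    X_scaled = StandardScaler().fit_transform(X)
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X_scaled)
    return float(silhouette_score(X_scaled, labels))
```

Calling `make_data()`, passing the split to `compare_classifiers`, and then handing the best model to `rank_features` mirrors the order of steps in the abstract; `cluster_quality` can be evaluated for several values of `k` to choose the number of clusters.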