Breakdown (BD) voltage is an important design parameter for high-voltage electric power machines. Currently, BD voltages in strongly inhomogeneous electric fields are mainly predicted with a semi-empirical formula. However, that formula cannot be applied to electrodes with weakly inhomogeneous electric fields. In this paper, positive lightning impulse BD voltages are predicted for various sphere-to-plane air gaps using three machine learning algorithms: support vector regression (SVR), Bayesian regression (BR) and multilayer perceptron (MLP). Unlike previous studies, we also propose a way to reduce the feature dimension: streamer propagation characteristics are introduced as new features, and electric field gradients are removed as unnecessary features. The streamer propagation characteristics are intended to reflect the possibility of a discharge process between the electrodes. The voltages predicted by the machine learning algorithms are compared with the experimental results and with the voltages calculated from the semi-empirical formula. Firstly, the predictions from each model agreed well with the datasets. The new features were found to be applicable to machine learning algorithms and to be as important as the known electrostatic features before discharge. Secondly, in strongly inhomogeneous electric fields, the predicted BD voltages were more accurate than the voltages calculated from the semi-empirical formula. In weakly inhomogeneous electric fields, the predictions from each model also agreed well with the experimental results. The prediction accuracy of SVR was better than that of BR and MLP. Unlike the semi-empirical method, the machine learning algorithms were also shown to be applicable to electrodes with a wide range of field inhomogeneities. We expect that the suggested features and machine learning algorithms can be used to calculate BD voltages accurately.
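As an illustration of how predicted and measured BD voltages might be compared across models, the following minimal sketch ranks models by mean absolute percentage error (MAPE). The choice of MAPE, the function names `mape` and `rank_models`, the model labels and all numerical values are assumptions made for illustration only; they are not the error metric or the data used in this paper.

```python
from statistics import mean


def mape(measured, predicted):
    """Mean absolute percentage error (in %) between measured and predicted values."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * mean(abs(p - m) / abs(m) for m, p in zip(measured, predicted))


def rank_models(measured, predictions):
    """Return (name, MAPE) pairs sorted from most to least accurate."""
    scores = {name: mape(measured, pred) for name, pred in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Placeholder values in kV, invented for illustration only; not data from the paper.
measured_kv = [100.0, 200.0, 400.0]
predictions_kv = {
    "model_a": [102.0, 196.0, 408.0],  # hypothetical model label
    "model_b": [110.0, 180.0, 440.0],  # hypothetical model label
}

ranking = rank_models(measured_kv, predictions_kv)
for name, error in ranking:
    print(f"{name}: MAPE = {error:.1f} %")
```

A relative error metric of this kind is convenient when the gaps span a wide range of BD voltages, because it weights short and long gaps equally; an absolute metric in kV would instead be dominated by the largest gaps.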