Background: Pneumonia is one of the most common complications after spontaneous intracerebral hemorrhage (sICH), where it is termed stroke-associated pneumonia (SAP). Timely identification of at-risk patients can help reduce poor outcomes. To date, there is no consensus on how to predict SAP, and the applicability of existing predictors is limited. This study aimed to develop a machine learning model to predict SAP after sICH.

Methods: We retrospectively reviewed 748 patients diagnosed with sICH and collected data in four categories: demographic features, clinical features, medical history, and laboratory tests. Five machine learning algorithms (logistic regression, gradient boosting decision tree, random forest, extreme gradient boosting, and category boosting) were used to build and validate predictive models. Recursive feature elimination with cross-validation was applied to obtain the best feature combination for each model. Predictive performance was evaluated by the area under the receiver operating characteristic curve (AUC).

Results: A total of 237 patients were diagnosed with SAP. The category boosting model performed best overall, with AUCs of 0.8307 in the training set and 0.8178 in the test set.

Conclusions: The incidence of SAP after sICH in our center was 31.68%. Machine learning could potentially assist in predicting SAP after sICH.
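
As an illustration of the evaluation metric named in the Methods, the following minimal Python sketch computes the AUC as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical, chosen only to show the calculation. The last lines recompute the reported incidence from the counts given in the abstract (237 of 748).

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison: fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = SAP, 0 = no SAP); not taken from the study.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
auc = roc_auc(labels, scores)

# Incidence recomputed from the counts reported in the abstract.
n_sap, n_total = 237, 748
incidence_pct = 100.0 * n_sap / n_total

print(f"toy AUC = {auc:.4f}")
print(f"incidence = {incidence_pct:.2f}%")
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of this size; library implementations typically use a rank-based or threshold-sweep computation instead.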