The aim of this study was to summarize the literature on the applications of machine learning (ML) and their performance in Emergency Medical Services (EMS). Four relevant electronic databases were searched (from inception through January 2024) for all original studies that employed EMS-guided ML algorithms to enhance the clinical and operational performance of EMS. Two reviewers screened the retrieved studies and extracted relevant data from the included studies. The characteristics of included studies, employed ML algorithms, and their performance were quantitively described across primary domains and subdomains. This review included a total of 164 studies published from 2005 through 2024. Of those, 125 were clinical domain focused and 39 were operational. The characteristics of ML algorithms such as sample size, number and type of input features, and performance varied between and within domains and subdomains of applications. Clinical applications of ML algorithms involved triage or diagnosis classification (n = 62), treatment prediction (n = 12), or clinical outcome prediction (n = 50), mainly for out-of-hospital cardiac arrest/OHCA (n = 62), cardiovascular diseases/CVDs (n = 19), and trauma (n = 24). The performance of these ML algorithms varied, with a median area under the receiver operating characteristic curve (AUC) of 85.6%, accuracy of 88.1%, sensitivity of 86.05%, and specificity of 86.5%. Within the operational studies, the operational task of most ML algorithms was ambulance allocation (n = 21), followed by ambulance detection (n = 5), ambulance deployment (n = 5), route optimization (n = 5), and quality assurance (n = 3). The performance of all operational ML algorithms varied and had a median AUC of 96.1%, accuracy of 90.0%, sensitivity of 94.4%, and specificity of 87.7%. Generally, neural network and ensemble algorithms, to some degree, out-performed other ML algorithms. Triaging and managing different prehospital medical conditions and augmenting ambulance performance can be improved by ML algorithms. Future reports should focus on a specific clinical condition or operational task to improve the precision of the performance metrics of ML models.