Prehospital electrocardiogram (PH-ECG) transmission is an important technology for reducing door-to-balloon time, but the decision to transmit often depends on the discretion of emergency medical technicians (EMTs), and studies based on real-world data remain scarce. This study reports a machine learning-based method for classifying the severity of PH-ECG images and explores its feasibility. PH-ECG data were compiled from 120 patients between September 2017 and September 2020. The resulting model, the first PH-ECG image classification model built on data from a Japanese study population, achieved a weighted F1-score of 0.85 and an area under the curve (AUC) of 0.93, which can be interpreted as an excellent balance of sensitivity and specificity. Cohen’s kappa coefficient between the model’s inferences and the reference labels created by two cardiologists was 0.68 (p < 0.05), which is considered “substantial” agreement according to the guidelines presented by Landis and Koch. Although we were not able to remove noise caused by patient movement or electrode detachment, the results indicate that image-based abnormality detection from PH-ECGs is feasible and effective, particularly in regions such as Japan where ECG data are often stored and transmitted as images. In our region, paramedics follow a multi-step process to decide whether to transmit an ECG, and the initial screening takes time. If the ECG is instead transmitted whenever either the paramedics or the deep learning model detects an abnormality, this approach is expected to reduce reading time and door-to-balloon time and to decrease false negatives.
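
As a minimal sketch of two ideas mentioned above, the following Python snippet computes Cohen's kappa between two sets of labels, maps it to the Landis and Koch agreement bands, and applies the "transmit if either the paramedic or the model flags an abnormality" rule. It is an illustration only, not the study's implementation: the function names (`cohen_kappa`, `landis_koch`, `should_transmit`, `false_negatives`) are hypothetical, and the eight-case toy dataset is invented for demonstration and does not reflect the study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from the marginal label frequencies of each rater.
    expected = sum(counts_a[k] * counts_b[k]
                   for k in counts_a.keys() | counts_b.keys()) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the Landis and Koch (1977) agreement bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, name in bands:
        if kappa <= upper:
            return name
    return "almost perfect"


def should_transmit(paramedic_flag, model_flag):
    """OR rule: transmit when either reader flags an abnormality."""
    return paramedic_flag or model_flag


def false_negatives(truth, predicted):
    """Count abnormal cases (truth True) that were not flagged."""
    return sum(t and not p for t, p in zip(truth, predicted))


# Invented toy data (NOT from the study): True means "abnormal".
truth     = [True, True, True, True, False, False, False, False]
paramedic = [True, True, False, False, False, False, True, False]
model     = [True, False, True, False, False, False, False, False]
combined  = [should_transmit(p, m) for p, m in zip(paramedic, model)]

print("kappa (model vs. truth):", cohen_kappa(truth, model))
print("band for kappa = 0.68:", landis_koch(0.68))
print("false negatives, paramedic only:", false_negatives(truth, paramedic))
print("false negatives, model only:", false_negatives(truth, model))
print("false negatives, OR rule:", false_negatives(truth, combined))
```

Because the OR rule flags a case whenever either reader does, it can never miss an abnormal case that one of the two readers caught, so its false-negative count is at most that of either reader alone. The trade-off, not quantified here, is that false positives from both readers are also combined.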