Objective: To develop a deep learning algorithm to perform multi-class classification of normal pediatric heart sounds, innocent murmurs, and pathologic murmurs.
Methods: We prospectively enrolled children under age 18 being evaluated by the Division of Pediatric Cardiology. Parents provided consent for a deidentified recording of their child's heart sounds with a digital stethoscope. Innocent murmurs were validated by a pediatric cardiologist and pathologic murmurs were validated by echocardiogram. To augment our collection of normal heart sounds, we utilized a public database of pediatric heart sound recordings (Oliveira, 2022). We propose two novel approaches for this audio classification task: we train a vision transformer on either Markov transition field or Gramian angular field image representations of the frequency spectrum. We benchmark our results against a ResNet-50 CNN trained on spectrogram images.
Results: Our final dataset consisted of 366 normal heart sounds, 175 innocent murmurs, and 216 pathologic murmurs. Innocent murmurs included Still's murmur, venous hum, and flow murmurs. Pathologic murmurs included ventricular septal defect, tetralogy of Fallot, aortic regurgitation, aortic stenosis, pulmonary stenosis, mitral regurgitation and stenosis, and tricuspid regurgitation. We find that the vision transformer consistently outperforms the ResNet-50 on all three image representations, and that the Gramian angular field is the superior image representation for pediatric heart sounds. We calculated a one-vs-rest multi-class ROC curve for each of the three classes. Our best model achieves area under the curve (AUC) values of 0.92 ± 0.05, 0.83 ± 0.04, and 0.88 ± 0.04 for identifying normal heart sounds, innocent murmurs, and pathologic murmurs, respectively.
Conclusion: We present two novel methods for pediatric heart sound classification, which outperform the current standard of using a convolutional neural network trained on spectrogram images.
To our knowledge, we are the first to demonstrate multi-class classification of pediatric murmurs. Multi-class output affords a more explainable and interpretable model, which can facilitate further model improvement later in the development cycle and enhance clinician trust and, in turn, adoption.
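
For readers unfamiliar with the Gramian angular field mentioned in the Methods, the following is a minimal sketch of the standard summation variant, not the authors' implementation. It assumes the input is a one-dimensional series such as a magnitude spectrum. The function name `gramian_angular_summation_field`, the toy signal, the sampling rate, and the choice of the summation (rather than difference) variant are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def gramian_angular_summation_field(series):
    """Return the Gramian angular summation field of a 1-D series.

    Illustrative helper (hypothetical name): the series is rescaled to
    [-1, 1], each value is mapped to an angle phi = arccos(value), and
    the image entry (i, j) is cos(phi_i + phi_j).
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1] so that arccos is defined everywhere.
    scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)  # guard against rounding error
    phi = np.arccos(scaled)
    # Pairwise angle sums via broadcasting give an (n, n) image.
    return np.cos(phi[:, None] + phi[None, :])


# Toy example: a noisy 40 Hz tone sampled at an assumed 2000 Hz.
rng = np.random.default_rng(0)
t = np.arange(512) / 2000.0
signal = np.sin(2 * np.pi * 40 * t) + 0.1 * rng.standard_normal(512)

# Magnitude spectrum of the toy signal (assumed preprocessing step).
spectrum = np.abs(np.fft.rfft(signal))
gasf = gramian_angular_summation_field(spectrum)
print(gasf.shape)  # (257, 257)
```

The resulting square array is symmetric with values in [-1, 1], so it can be rendered as a single-channel image and resized to whatever input resolution an image classifier expects.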