Abstract

Congenital heart disease (CHD) is a common birth defect in children. Intelligent auscultation algorithms have been shown to reduce the subjectivity of diagnosis and to ease the workload of doctors. However, the development of such algorithms has been limited by the lack of reliable, standardized, and publicly available pediatric heart sound databases. The objective of this research is therefore to develop a large-scale, high-quality, and accurately labeled pediatric CHD heart sound database, and to benchmark it on classification tasks, thereby filling this research gap. From 2020 to 2022, we collaborated with experienced cardiac surgeons from Zhejiang University Children's Hospital to collect heart sound signals from 1259 participants using electronic stethoscopes. To ensure accurate diagnosis, cardiac ultrasound images for each participant were acquired by an experienced ultrasonographer, and the final diagnosis was confirmed by the consensus of two cardiac experts or cardiac surgeons. To establish a benchmark for ZCHSound, we extracted 84 time-frequency features from the heart sounds and evaluated classification performance using machine learning models. We also used SHapley Additive exPlanations (SHAP) values to score the importance of the 84 features in distinguishing normal from pathological heart sounds in children. The ZCHSound database contains heart sound data from all 1259 participants, divided into two datasets: a high-quality, filtered, clean heart sound dataset and a low-quality, noisy heart sound dataset. On the high-quality dataset, our random forest ensemble model achieved an F1 score of 90.3% in classifying normal versus pathological heart sounds.
The SHAP analysis shows that frequency-domain features have a greater impact on the model output than time-domain features, and that features related to the diastolic period influence the model's classification results more than those related to the systolic period. This study has established a large-scale, high-quality, rigorously standardized pediatric CHD heart sound database with precise disease diagnoses. The database provides a learning resource for clinicians studying auscultation and data support for algorithm engineers developing intelligent auscultation algorithms. Our data can be accessed and downloaded by the public at http://zchsound.ncrcch.org.cn/.
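For readers unfamiliar with the reported metric, the following is a minimal sketch of how a binary F1 score is computed from true and predicted labels. It is not the authors' evaluation code: the function name `f1_score`, the choice of label 1 for "pathological", and the toy labels are all assumptions made for illustration, and the abstract does not state which averaging scheme was used for the 90.3% figure.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (1 = pathological, 0 = normal); not taken from ZCHSound.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and the F1 score is 0.75.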
