The object of this research is a data-efficient pipeline for detecting atrial septal defects (ASDs) in echocardiographic images. ASDs are common congenital heart defects that can lead to serious health issues if not diagnosed early, and rising mortality rates due to undetected ASDs highlight the urgent need for improved diagnostic methods. The problem addressed is that limited annotated medical data hinders the training of accurate detection models. To address it, this study fine-tuned the SegFormer model to segment cardiac structures in echocardiography images, focusing on the four-chamber heart view that is essential for ASD detection. SegFormer was then integrated with the YOLOv7 model, known for real-time object detection, to identify ASD regions within the segmented heart structures. Cross-referencing the detections against the segmented anatomy keeps diagnoses anatomically consistent and reduces false positives. The results demonstrate that, despite limited data, the integrated method achieves high accuracy and speed, outperforming traditional models. This improvement is explained by the synergy between SegFormer’s transformer-based segmentation and YOLOv7’s efficient detection capabilities. The distinctive feature of the approach is the successful integration of these models in a data-efficient manner, enabling effective ASD detection even with scarce data. The scope of practical use includes deployment in clinical settings with limited resources, requiring only echocardiographic equipment and basic computational resources. By providing clinicians with a reliable tool for ASD detection, the study supports timely interventions in pediatric cardiology, ultimately improving patient outcomes and enhancing care consistency.
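
The cross-referencing of detections against the segmentation can be illustrated with a minimal sketch. The abstract does not specify the rule used, so the following is an assumption made purely for illustration: a candidate ASD bounding box is kept only if a sufficient fraction of its pixels falls inside the segmented heart region. The function names `overlap_fraction` and `filter_detections`, the box format, and the `min_overlap` threshold are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple

# Hypothetical box format: (x_min, y_min, x_max, y_max), with the max edges exclusive.
Box = Tuple[int, int, int, int]


def overlap_fraction(mask: Sequence[Sequence[int]], box: Box) -> float:
    """Return the fraction of the box's pixels that lie on mask pixels equal to 1."""
    x0, y0, x1, y1 = box
    height = len(mask)
    width = len(mask[0]) if height else 0
    area = max(0, x1 - x0) * max(0, y1 - y0)
    if area == 0:
        return 0.0
    # Clip the box to the image bounds before counting mask pixels.
    cx0, cy0 = max(0, x0), max(0, y0)
    cx1, cy1 = min(width, x1), min(height, y1)
    inside = sum(mask[y][x] for y in range(cy0, cy1) for x in range(cx0, cx1))
    return inside / area


def filter_detections(
    mask: Sequence[Sequence[int]],
    boxes: Sequence[Box],
    min_overlap: float = 0.5,  # assumed threshold, not a value from the study
) -> List[Box]:
    """Keep only boxes that sufficiently overlap the segmented heart region."""
    return [b for b in boxes if overlap_fraction(mask, b) >= min_overlap]


# Toy 6x6 binary mask standing in for a segmentation output:
# the "heart" occupies rows 1-4 and columns 1-4.
mask = [[1 if 1 <= x <= 4 and 1 <= y <= 4 else 0 for x in range(6)] for y in range(6)]

# Toy candidate boxes standing in for detector outputs.
candidates = [(2, 2, 4, 4), (4, 4, 6, 6)]
kept = filter_detections(mask, candidates)
```

In this toy case the first box lies entirely inside the mask and is kept, while the second box mostly falls outside the segmented region and is discarded. In a real pipeline the mask and boxes would come from the segmentation and detection models respectively, and the overlap threshold would need to be tuned on validation data.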