This study aimed to assess the diagnostic accuracy of two-dimensional ultrasound at 11-14 weeks' gestation as a screening test for individual fetal anomalies and to identify factors affecting screening performance. A systematic review and meta-analysis was developed and registered with PROSPERO (CRD42018111781). MEDLINE, EMBASE, Web of Science Core Collection and the Cochrane Library were searched for studies evaluating the diagnostic accuracy of screening for 16 predefined, non-cardiac, congenital anomalies considered to be of interest to the early anomaly scan. We included prospective and retrospective studies from any healthcare setting conducted in low-risk, mixed-risk and unselected populations. The reference standard was the detection of an anomaly on postnatal or postmortem examination. Data were extracted to populate 2 × 2 tables and a random-effects model was used to determine the diagnostic accuracy of screening for the predefined anomalies (individually and as a composite). Secondary analyses examined the impact of imaging protocol, ultrasound modality, publication year and index of sonographer suspicion at the time of scanning on detection rates. A post-hoc secondary analysis assessed performance among studies published during or after 2010. Risk of bias and study quality were assessed for included studies using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. From 5684 citations, 202 papers underwent full-text review, resulting in the inclusion of 52 studies comprising 527 837 fetuses, of which 2399 were affected by one or more of the 16 predefined anomalies.
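As an illustration of how a 2 × 2 table translates into accuracy estimates, and how per-study detection rates might be pooled, the following is a minimal sketch only. The abstract does not state which random-effects estimator was used; the DerSimonian-Laird method on the logit scale with a 0.5 continuity correction is an assumption made here for illustration, and all names (`TwoByTwo`, `pool_logit_dl`) and all counts are hypothetical rather than taken from the review.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Counts from one study's 2 x 2 table (hypothetical structure)."""
    tp: int  # anomaly present, detected on first-trimester scan
    fp: int  # anomaly absent, scan positive
    fn: int  # anomaly present, scan negative
    tn: int  # anomaly absent, scan negative

    def sensitivity(self) -> float:
        # Detection rate: proportion of affected fetuses identified
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Proportion of unaffected fetuses correctly screened negative
        return self.tn / (self.tn + self.fp)


def pool_logit_dl(events, totals):
    """Pool proportions across studies on the logit scale.

    Assumption: DerSimonian-Laird random-effects estimator with a
    0.5 continuity correction applied to every study.
    """
    ys, vs = [], []
    for e, n in zip(events, totals):
        e_c, ne_c = e + 0.5, (n - e) + 0.5
        ys.append(math.log(e_c / ne_c))      # logit of corrected proportion
        vs.append(1.0 / e_c + 1.0 / ne_c)    # approximate within-study variance

    w = [1.0 / v for v in vs]
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, ys)) / sw
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, ys))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - (len(ys) - 1)) / c) if c > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in vs]
    y_re = sum(wi * yi for wi, yi in zip(w_re, ys)) / sum(w_re)
    return 1.0 / (1.0 + math.exp(-y_re))     # back-transform to a proportion


# Hypothetical counts, for illustration only
study = TwoByTwo(tp=8, fp=1, fn=2, tn=989)
pooled_detection = pool_logit_dl(events=[8, 15, 4], totals=[10, 20, 9])
```

Pooling on the logit scale keeps the back-transformed estimate within 0 and 1, which matters when detection rates or specificities sit close to the boundary.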
Individual anomalies were not equally amenable to detection on first-trimester ultrasound: a high (> 80%) detection rate was reported for severe conditions, including acrania (98%), gastroschisis (96%), exomphalos (95%) and holoprosencephaly (88%); the detection rate was lower for open spina bifida (69%), lower urinary tract obstruction (66%), lethal skeletal dysplasias (57%) and limb-reduction defects (50%); and the detection rate was below 50% for facial clefts (43%), polydactyly (40%) and congenital diaphragmatic hernia (38%). Conditions with a low (< 30%) detection rate included bilateral renal agenesis (25%), closed spina bifida (21%), isolated cleft lip (14%) and talipes (11%). Specificity was > 99% for all anomalies. Secondary analysis showed that detection improved with advancing publication year, and that the use of imaging protocols had a statistically significant impact on screening performance (P < 0.0001). The accurate detection of congenital anomalies using first-trimester ultrasound is feasible, although detection rates and false-positive rates depend on the type of anomaly. The use of a standardized protocol allows diagnostic performance to be maximized, particularly for the detection of spina bifida, facial clefts and limb-reduction defects. Identifying the types of anomalies amenable to diagnosis and the factors that enhance screening performance can support the development of first-trimester anomaly screening programs. © 2024 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.