Reducing False-Positive Results in Newborn Screening Using Machine Learning.

Gang Peng, Curt Scharfe, Yishuo Tang, Hongyu Zhao, Tina M Cowan, Gregory M Enns

doi:10.3390/ijns6010016

Open Access

Abstract

Newborn screening (NBS) for inborn metabolic disorders is a highly successful public health program that by design is accompanied by false-positive results. Here we trained a Random Forest machine learning classifier on screening data to improve prediction of true and false positives. Data included 39 metabolic analytes detected by tandem mass spectrometry and clinical variables such as gestational age and birth weight. Analytical performance was evaluated for a cohort of 2777 screen positives reported by the California NBS program, which consisted of 235 confirmed cases and 2542 false positives for one of four disorders: glutaric acidemia type 1 (GA-1), methylmalonic acidemia (MMA), ornithine transcarbamylase deficiency (OTCD), and very long-chain acyl-CoA dehydrogenase deficiency (VLCADD). Without changing the sensitivity to detect these disorders in screening, Random Forest-based analysis of all metabolites reduced the number of false positives for GA-1 by 89%, for MMA by 45%, for OTCD by 98%, and for VLCADD by 2%. All primary disease markers and previously reported analytes such as methionine for MMA and OTCD were among the top-ranked analytes. Random Forest’s ability to classify GA-1 false positives was found similar to results obtained using Clinical Laboratory Integrated Reports (CLIR). We developed an online Random Forest tool for interpretive analysis of increasingly complex data from newborn screening.
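The phrase "without changing the sensitivity" can be illustrated with a small sketch. It assumes that a classifier assigns each screen-positive case a score between 0 and 1 (for example, a Random Forest vote fraction) and that a case is still flagged when its score is at or above a cutoff. Setting the cutoff to the lowest score among confirmed cases keeps every confirmed case flagged, and the false positives scoring below it are the ones removed. The function names and all scores below are hypothetical and are not taken from the paper; in practice the scores would need to come from held-out or cross-validated predictions rather than from the training data.

```python
from typing import Sequence


def sensitivity_preserving_cutoff(case_scores: Sequence[float]) -> float:
    """Return the highest cutoff that still flags every confirmed case.

    A case is flagged when score >= cutoff, so the lowest confirmed-case
    score is the largest cutoff that loses no confirmed case.
    """
    if not case_scores:
        raise ValueError("need at least one confirmed case")
    return min(case_scores)


def false_positive_reduction(
    case_scores: Sequence[float], fp_scores: Sequence[float]
) -> float:
    """Fraction of false positives that fall below the cutoff."""
    if not fp_scores:
        return 0.0
    cutoff = sensitivity_preserving_cutoff(case_scores)
    removed = sum(1 for score in fp_scores if score < cutoff)
    return removed / len(fp_scores)


# Hypothetical classifier scores (illustration only, not data from the study).
case_scores = [0.91, 0.78, 0.66, 0.84]  # confirmed (true-positive) cases
fp_scores = [0.05, 0.12, 0.70, 0.31, 0.09, 0.66, 0.22, 0.48, 0.15, 0.02]

cutoff = sensitivity_preserving_cutoff(case_scores)
reduction = false_positive_reduction(case_scores, fp_scores)
print(f"cutoff = {cutoff:.2f}, false positives removed: {reduction:.0%}")
```

With these made-up scores the cutoff is 0.66, and 8 of the 10 false positives score below it. A false positive that scores as high as the weakest confirmed case cannot be removed without also lowering sensitivity, which is one way a disorder could show only a small reduction.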

Highlights

Newborn screening (NBS) using tandem mass spectrometry (MS/MS) has transformed our ability to identify and provide early, lifesaving treatment to infants with hereditary metabolic diseases
Without changing the sensitivity of first-tier screening for these disorders, Random Forest (RF) reduced the number of false positives by 89% for glutaric acidemia type 1 (GA-1), 45% for methylmalonic acidemia (MMA), 98% for ornithine transcarbamylase deficiency (OTCD) and by 2% for very long-chain acyl-CoA dehydrogenase deficiency (VLCADD) (Table 1)
Although MS/MS screening identifies most infants with a metabolic disorder on the Recommended Uniform Screening Panel (RUSP), it creates a high number of false positives, which requires additional confirmatory testing of all screen-positive cases

Summary

Introduction

Newborn screening (NBS) using tandem mass spectrometry (MS/MS) has transformed our ability to identify and provide early, lifesaving treatment to infants with hereditary metabolic diseases. Additional biochemical and DNA testing of all screen-positive cases is performed to confirm (true positive) or reject (false positive) the primary screening result and to reach a final diagnosis. In some cases, this two-tier strategy can lead to iterative testing rounds and diagnostic delays, placing an undue burden on the healthcare system, including physicians and clinical laboratories, and on patients and their families. As an alternative approach to analyte cutoffs, Clinical Laboratory Integrated Reports (CLIR, formerly R4S) postanalytical testing employs a large database of dynamic reference ranges for disease-related analytes and many additional informative analyte ratios in order to improve separation of true- and false-positive cases [2,3,4,5]. The ranges and overlap of analyte values between patient and control groups can be adjusted in CLIR for multiple continuous and clinical variables (e.g., birth weight, sex, age at blood collection), an adjustment that has been shown to significantly reduce false-positive results [6].

Methods

Results

Conclusion