Noninvasive prenatal testing (NIPT) reduces the risk of procedure-related miscarriage. However, because of limits in accuracy, special fetal cases, and economic and policy gaps, NIPT still cannot replace traditional surgical methods. Developing a pipeline that combines low cost, low technical difficulty, stability, and high accuracy is a major challenge for the wide adoption of NIPT. This study proposes a new pipeline for the detection of fetal trisomy that consists of three steps: (1) 40 bp single-end sequencing, (2) PchrN calculation, and (3) logistic regression (LR) models. A subset of a public dataset (100 of 144 samples) was used to train the models and select features in the machine learning pipeline. An independent test set of 314 samples from different sources was used for evaluation. We compare the performance of our method with that of the bioinformatics method widely used today. Our model is highly robust to data from different sources. The best final model achieved an AUC of 99.85 % in predicting T21 using the chr21 features, which are DNA fragment concentrations. The AUCs for predicting T18 and T13 with all features from 24 chromosomes were 99.50 % and 97.70 %, respectively. The positive predictive values (PPV) for T21, T18, and T13 were 91.67 %, 93.33 %, and 83.33 %, respectively, which were higher than those obtained by standard bioinformatics methods. The negative predictive values (NPV) for T21, T18, and T13 were 100 %, 99.33 %, and 98.70 %, respectively. Our approach does not need to calculate fetal fraction (FF) and can handle samples from a wide range of gestational ages (GA), twin pregnancies, and fetal mosaicism. It achieves accuracy comparable to that of the current standard bioinformatics analysis on low-depth sequencing data. This convenient pipeline can be used independently of traditional bioinformatics methods, and its performance has been tested in real clinical practice. Our pipeline can be an important aid for the detection of fetal trisomy in clinical NIPT, which will help further popularize NIPT.
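As a rough illustration of the quantities named in the abstract, the sketch below computes a per-chromosome read fraction and the PPV and NPV from confusion-matrix counts. It rests on assumptions not stated in the abstract: the per-chromosome feature is taken here to be the share of mapped reads assigned to each chromosome, which may differ from the authors' PchrN definition, and the function names, toy reads, and counts are hypothetical and are not the study's data.

```python
from collections import Counter


def chromosome_read_fractions(read_chromosomes):
    """Return the fraction of mapped reads assigned to each chromosome.

    Assumption: this stands in for a per-chromosome concentration feature;
    the abstract does not define PchrN, so this is illustrative only.
    """
    counts = Counter(read_chromosomes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    return {chrom: n / total for chrom, n in counts.items()}


def ppv_npv(tp, fp, tn, fn):
    """Compute positive and negative predictive values from counts."""
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy input: one chromosome label per mapped read (hypothetical data).
reads = ["chr1"] * 6 + ["chr21"] * 2 + ["chrX"] * 2
fractions = chromosome_read_fractions(reads)

# Hypothetical confusion-matrix counts, not taken from the study.
ppv, npv = ppv_npv(tp=11, fp=1, tn=300, fn=0)
```

In a full pipeline, fractions like these would be the input features of a logistic regression classifier, with one model per trisomy; that step is omitted here because the abstract does not give the model details.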