Abstract 2081: Nano2NGS: A framework for converting nanopore sequencing data to NGS-liked sequencing data for data analysis

Jidong Lang

doi:10.1158/1538-7445.am2023-2081

Journal: Cancer Research

Publication Date: Apr 4, 2023

Abstract

Background: Nanopore sequencing, a single-molecule real-time sequencing technology, can read individual DNA/RNA molecules without polymerase chain reaction (PCR) amplification. Although nanopore sequencing has made significant progress in some areas, its application remains limited compared with next-generation sequencing (NGS) because of its design principles and data characteristics, especially in hotspot mutation detection, metagenomic analysis, short tandem repeat (STR) detection, and human leukocyte antigen (HLA) typing.

Methods: The Nano2NGS framework comprises 4 modules: Nano2NGS-Muta, -Meta, -STR, and -HLA. For hotspot mutation detection, we tested Nano2NGS-Muta + Freebayes on simulated data containing 12 hotspot mutations (SNVs and InDels) and on a standard sample containing 7 hotspot mutations at ~5% mutation frequency. For metagenomic analysis, we tested Nano2NGS-Meta + Kraken2 on simulated data containing 20 bacterial species from the NCBI database and on a standard sample containing 7 bacteria with ~14% theoretical microbial composition. For STR detection, we tested Nano2NGS-STR on simulated data containing 4 STR markers (DYS392, DYS438, DYS448, and DYS635) and on a standard sample covering 44 STR markers from the intersection of two commercial products. For HLA typing, we tested Nano2NGS-HLA + Athlates on 1 simulated dataset and 1 standard sample. The GM12878 gDNA sample was used as a negative control. All samples were run in 3 experimental replicates. Nanopore sequencing was performed on the QNome-9604/3841 instrument according to the manufacturer's instructions.

Results: All 4 modules performed perfectly on the simulated datasets. On the standard sample datasets, for hotspot mutation detection, the detection sensitivity in the repeated experiments was 95.24%, 100.00%, and 85.71%, respectively. The limit of detection (LoD) of Nano2NGS-Muta + Freebayes may be 2% or even 5% according to the negative dataset. For metagenomic analysis, the overall performance of Nano2NGS-Meta + Kraken2 was better, with an R-squared value of 0.9357 against the expected relative abundance. For STR detection, Nano2NGS-STR achieved the best performance compared with Tandem-Genotypes and TriCoLOR, with 71.97% (9948) and 53.03% (2800M) concordance, respectively. For HLA typing, Nano2NGS-HLA + Athlates returned HLA-A*02/HLA-A*11, HLA-B*35/HLA-B*39, and HLA-C*07/HLA-C*12, consistent with the expected genotypes at the allele group level (field 1).

Conclusions: The Nano2NGS framework converts nanopore sequencing data into NGS-like short-read data that is compatible with NGS data analysis software. Importantly, Nano2NGS breaks down the barriers between data analysis methods for short-read and long-read sequencing, and accelerates the application of nanopore sequencing technology in scientific research and clinical practice.

Citation Format: Jidong Lang. Nano2NGS: A framework for converting nanopore sequencing data to NGS-liked sequencing data for data analysis [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 2081.
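The abstract does not describe how Nano2NGS performs the long-read to short-read conversion. As a rough illustration of the general idea only, the sketch below cuts one long read into overlapping fixed-length fragments and writes them as FASTQ records, which is the kind of input short-read tools expect. The function names (`fragment_read`, `to_fastq`), the 150-base window, and the 75-base step are all hypothetical choices made for this example; they are not taken from the Nano2NGS framework.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (read name, bases, quality string)


def fragment_read(
    name: str,
    seq: str,
    qual: str,
    read_len: int = 150,  # illustrative short-read length (assumption)
    step: int = 75,       # illustrative window step (assumption)
) -> Iterator[Record]:
    """Yield overlapping short fragments cut from one long read.

    Hypothetical sliding-window scheme, not the published Nano2NGS method.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality must have the same length")
    if read_len <= 0 or step <= 0:
        raise ValueError("read_len and step must be positive")
    if not seq:
        return  # nothing to emit for an empty read
    if len(seq) <= read_len:
        # Read is already short: pass it through unchanged.
        yield f"{name}_0", seq, qual
        return
    last_start = len(seq) - read_len
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # extra window so the read tail is covered
    for start in starts:
        end = start + read_len
        yield f"{name}_{start}", seq[start:end], qual[start:end]


def to_fastq(records: Iterable[Record]) -> str:
    """Serialize records as four-line FASTQ entries."""
    return "".join(f"@{n}\n{s}\n+\n{q}\n" for n, s, q in records)


if __name__ == "__main__":
    long_seq = "ACGT" * 100        # toy 400-base "long read"
    long_qual = "I" * len(long_seq)
    fragments = list(fragment_read("read1", long_seq, long_qual))
    print(to_fastq(fragments[:1]), end="")
```

Overlapping windows are used here so that a variant near a window boundary still appears in the interior of at least one fragment; a real pipeline would also have to decide how to handle read orientation, per-base quality differences, and error profiles, none of which this sketch addresses.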