Abstract

Human papillomavirus (HPV) is a causal agent for most cervical cancers. The physical status of the HPV genome in these cancers can be episomal, integrated, or both. HPV integration could serve as a biomarker for clinical diagnosis, treatment, and prognosis. Although whole-genome sequencing by next-generation sequencing (NGS) technologies, such as the Illumina sequencing platform, has been used to detect integrated HPV genomes in cervical cancer, it faces challenges in analyzing long repeats and translocated sequences. In contrast, Oxford Nanopore sequencing technology can generate ultra-long reads, which could make it a very useful tool for determining the HPV genome sequence and its physical status in cervical cancer. As a proof of concept, in this study we performed whole-genome sequencing of a cervical cancer tissue sample and the CaSki cell line with Oxford Nanopore Technologies. From the cervical cancer tissue, a 7,894 bp HPV35 genomic sequence was assembled from 678 reads at 97-fold coverage of the HPV genome, sharing 99.96% identity with the HPV sequence obtained by Sanger sequencing. A 7,904 bp HPV16 genomic sequence was assembled from data generated from the CaSki cell line at 3,857-fold coverage, sharing 99.99% identity with the reference genome (NCBI: U89348). Intriguingly, long reads generated by nanopore sequencing directly revealed chimeric cellular–viral sequences and concatemeric genomic sequences, leading to the discovery of 448 unique integration breakpoints in the CaSki cell line and 60 breakpoints in the cervical cancer sample. Taken together, nanopore sequencing is a unique tool for identifying HPV sequences and could shed light on the physical status of the HPV genome in its associated cancers.
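The coverage and identity figures quoted in the abstract follow from simple ratios. The sketch below is only illustrative arithmetic, not the authors' pipeline: the function names are hypothetical, fold coverage is assumed to mean total aligned bases divided by genome length, and identity is assumed to mean matching positions divided by alignment length. The mismatch counts used in the usage lines are assumptions chosen to be consistent with the rounded percentages; the study does not report them.

```python
def fold_coverage(total_aligned_bases: int, genome_length: int) -> float:
    """Mean depth: aligned bases divided by the length of the target genome."""
    return total_aligned_bases / genome_length


def mean_aligned_bases_per_read(coverage: float, genome_length: int, n_reads: int) -> float:
    """Average number of bases each read contributes at a given mean depth."""
    return coverage * genome_length / n_reads


def percent_identity(matches: int, alignment_length: int) -> float:
    """Share of aligned positions that are identical, as a percentage."""
    return 100.0 * matches / alignment_length


# 678 reads giving 97-fold coverage of a 7,894 bp genome implies roughly
# 1.1 kb of HPV-aligned sequence per read on average.
per_read = mean_aligned_bases_per_read(97, 7894, 678)

# Assumed, not reported: 3 mismatches over 7,894 bp and 1 mismatch over 7,904 bp.
hpv35_identity = percent_identity(7894 - 3, 7894)
hpv16_identity = percent_identity(7904 - 1, 7904)
```

Under these assumptions, the reported identities correspond to only a handful of differing bases across each assembled genome.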

Highlights

  • To further validate the assembled HPV35 from nanopore sequencing, 17 segments that cover the whole HPV genome were amplified from the cervical cancer tissue sample and sequenced with Sanger sequencing (Figure 2D)

  • From whole-genome sequencing data generated by nanopore sequencing technology, the HPV genome can be assembled with high accuracy using our bioinformatic strategy

Introduction

Human papillomavirus (HPV), a double-stranded circular DNA virus, is a causal agent for most cervical cancers (Zur Hausen, 2002; Abu-Lubad et al, 2020) and is associated with anal cancer (Alemany et al, 2015), oropharyngeal cancer (Meng et al, 2020), and vaginal cancer (Hellman et al, 2004). In HPV-related cancers, the physical status of the HPV genome has been found to be episomal, integrated, or mixed (Park et al, 1997; Nambaru et al, 2009; Niya et al, 2019). CaSki cells, a naturally derived cervical carcinoma cell line, contain a high number of concatemeric HPV genomic sequences inserted within the cellular genome (Yee et al, 1985) and focal genomic variations at the integration locus (McBride and Warburton, 2017). Short reads (100–500 bp) generated by NGS can lead to errors and ambiguity in mapping viral integration and assembling repetitive sequences (Alkan et al, 2011; Treangen and Salzberg, 2011).
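To illustrate why a single long read can resolve an integration site that short reads leave ambiguous, the following minimal sketch scans the alignment segments of one read and reports junctions where a host-aligned segment is directly followed by a virus-aligned segment, or the reverse. It is not the method used in the study. The `Segment` record, the `find_breakpoints` function, the `max_gap` threshold, and the 0-based half-open coordinate convention are all assumptions made for illustration; in practice the segments would come from a long-read aligner's primary and supplementary alignments.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical record, 0-based half-open)."""
    read_start: int
    read_end: int
    ref_name: str
    ref_start: int
    ref_end: int
    strand: str  # "+" or "-"


def edge_coordinate(seg: Segment, side: str) -> int:
    """Reference position at the left or right edge of a segment, in read order."""
    if side == "right":
        return seg.ref_end if seg.strand == "+" else seg.ref_start
    return seg.ref_start if seg.strand == "+" else seg.ref_end


def find_breakpoints(
    segments: List[Segment], viral_refs: Set[str], max_gap: int = 50
) -> List[Tuple[str, int, str, int]]:
    """Return (host_ref, host_pos, virus_ref, virus_pos) for each host-virus junction."""
    ordered = sorted(segments, key=lambda s: s.read_start)
    junctions = []
    for left, right in zip(ordered, ordered[1:]):
        left_viral = left.ref_name in viral_refs
        right_viral = right.ref_name in viral_refs
        if left_viral == right_viral:
            continue  # host-host or virus-virus (e.g. concatemer), not an integration junction
        if abs(right.read_start - left.read_end) > max_gap:
            continue  # segments are not adjacent on the read
        left_pos = edge_coordinate(left, "right")
        right_pos = edge_coordinate(right, "left")
        if left_viral:
            junctions.append((right.ref_name, right_pos, left.ref_name, left_pos))
        else:
            junctions.append((left.ref_name, left_pos, right.ref_name, right_pos))
    return junctions


# Hypothetical read: 5 kb of host sequence followed by about 4 kb of HPV16.
read_segments = [
    Segment(0, 5000, "chr2", 100000, 105000, "+"),
    Segment(5010, 9000, "HPV16", 3000, 6990, "+"),
]
breakpoints = find_breakpoints(read_segments, {"HPV16"})
```

A read of 100 to 500 bp rarely carries enough unique sequence on both sides of such a junction, whereas a multi-kilobase read anchors both the host and the viral side in one molecule. Adjacent virus-to-virus segments are skipped here because they indicate concatemeric viral sequence rather than a host-virus breakpoint.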