Abstract

In recent years, long-read sequencing technology has had a substantial impact on various aspects of genome sciences. Here, we introduce recent studies of cancerous structural variants (SVs) using long-read sequencing technologies, namely Pacific Biosciences (PacBio) sequencers, Oxford Nanopore Technologies (ONT) sequencers, and linked-read methods. By taking advantage of long read lengths, these technologies have enabled the precise detection of SVs, including long insertions by transposable elements, such as LINE-1. In addition to SV detection, long-read sequencing technologies have unveiled the epigenome status (including DNA methylation and haplotype information) surrounding SV loci, which helps to identify the effects of SVs. Among the various research fields in which long-read sequencing has been applied, cancer genomics has shown the most remarkable advances. Many studies are beginning to shed light on the detection of SVs and the elucidation of their complex structures in various types of cancer. For cancers in particular, we summarize the technical limitations of applying this technology to the analysis of clinical samples. We introduce recent achievements from this viewpoint, although similar approaches are expected to be taken up for other applications in the near future. By complementing current short-read sequencing analysis, long-read sequencing should reveal the complex nature of human genomes in their healthy and disease states, which will open new opportunities for a better understanding of disease development and for novel drug development strategies.

Highlights

  • In recent years, the so-called long-read sequencing technology has had a substantial impact on various aspects of genome sciences

  • For researchers working on technical development, we further summarize some limitations of recent long-read sequencing projects, namely, 1) the huge amount of input DNA required, 2) error-prone sequencing outputs, 3) the presence of several genomic aberrations in cancer genomes that are too large to be covered, even by long reads, 4) challenges in visualizing complicated genome structures, and 5) bias from reference-dependent detection of structural variants (SVs)

  • We demonstrated that the presence of Cancerous Local Copy-number Lesions (CLCLs) led to aberrant transcription of RNA and affected the function of the proteins produced by the genes involved in them


Summary

Studies of SV in human cancer genomes using long-read sequencing

There are two broad categories of computational methods for detecting SVs from long-read data: mapping-based methods and de novo assembly-based methods. [...] regions, which comprise highly repetitive sequences and represent ambiguous bases in the current human reference genome. [...] They detected the amplification of cancer-related genes, such as EGFR, PDGFRA, and CDK4. Aganezov et al, who work in the same research group as Nattestad and colleagues, performed deep whole-genome sequencing of a breast cancer cell line and two breast cancer clinical samples using ONT PromethION, PacBio, 10X linked-read sequencing, and Illumina sequencing, to detect and characterize SVs precisely [55]. Fujimoto et al sought to construct a catalog of polymorphic and somatic SVs from long-read sequencing data, based on ONT MinION sequencing of 11 Japanese liver cancers that had been previously sequenced by the International Cancer Genome Consortium [59]. For this purpose, they developed a new analytical pipeline called CAMPHOR. We are convinced that several very complicated SVs, such as CLCLs, play important roles in tumorigenesis and/or cancer progression, and that these SVs need to be precisely identified using long-read sequencing technologies.
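As a rough illustration of the mapping-based category mentioned above, the following minimal sketch scans the CIGAR string of a single aligned long read and reports large insertions and deletions relative to the reference. It is a toy example, not the method of any study cited here: the function name `call_svs_from_cigar`, the 50 bp size cutoff, and the example alignment are all assumptions made for illustration. Practical pipelines would typically also use split alignments and cluster supporting evidence across many reads.

```python
import re

# Minimum event size to report as an SV. 50 bp is a commonly used
# convention and is assumed here purely for illustration.
MIN_SV_LEN = 50

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def call_svs_from_cigar(chrom, ref_start, cigar, min_len=MIN_SV_LEN):
    """Return (chrom, ref_pos, sv_type, length) tuples for large indels.

    ref_start is the 0-based reference position where the alignment begins.
    This hypothetical helper looks at one alignment only; it does not
    cluster evidence across reads or handle split alignments.
    """
    calls = []
    ref_pos = ref_start
    for length_str, op in CIGAR_OP.findall(cigar):
        length = int(length_str)
        if op == "I" and length >= min_len:
            # Inserted bases are absent from the reference, so the event
            # is reported at the current reference position.
            calls.append((chrom, ref_pos, "INS", length))
        elif op == "D" and length >= min_len:
            calls.append((chrom, ref_pos, "DEL", length))
        # Only these operations consume reference bases.
        if op in "MDN=X":
            ref_pos += length
    return calls


# Hypothetical alignment: a 300 bp insertion and a 120 bp deletion are
# reported; the 10 bp insertion falls below the size cutoff.
example_calls = call_svs_from_cigar("chr1", 1000, "500M300I200M120D80M10I50M")
for call in example_calls:
    print(call)
```

An assembly-based approach would instead assemble the reads into contigs first and then compare those contigs with the reference, which is why it is less dependent on how individual reads happen to map.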

Transposable elements and SVs
DNA methylation and SVs
Haplotype phasing and SVs
Findings
Summary and outlook
