Detection of somatic structural variants from short-read next-generation sequencing data.

Tingting Gong, Eva K F Chan, Vanessa M Hayes

doi:10.1093/bib/bbaa056

Abstract

Somatic structural variants (SVs), which are variants that typically impact >50 nucleotides, play a significant role in cancer development and evolution but are notoriously more difficult to detect than small variants from short-read next-generation sequencing (NGS) data. This is due to a combination of challenges attributed to the purity of tumour samples, tumour heterogeneity, limitations of short-read information from NGS and sequence alignment ambiguities. Despite active development of SV detection tools (callers) over the past few years, each method has inherent advantages and limitations. In this review, we highlight some of the important factors affecting somatic SV detection and compare the performance of seven commonly used SV callers. In particular, we focus on the extent of change in sensitivity and precision for detecting different SV types and size ranges from samples with differing variant allele frequencies and sequencing depths of coverage. We highlight the reasons why some SV callers perform well in some settings but not others, allowing our evaluation findings to be extended beyond the seven SV callers examined in this paper. As the importance of large SVs becomes increasingly recognized in cancer genomics, this paper provides a timely review of some of the most impactful factors influencing somatic SV detection that should be considered when choosing SV callers.
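The sensitivity and precision referred to in the abstract can be illustrated with a minimal sketch. The code below is not the authors' evaluation pipeline: the `SV` record, the `matches` rule and the 200 bp breakpoint tolerance are assumptions chosen purely for illustration. A call counts as a true positive when it has the same chromosome and SV type as an unmatched truth-set variant and both breakpoints lie within the tolerance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """Hypothetical minimal SV record (not a real caller's output format)."""
    chrom: str
    start: int
    end: int
    svtype: str  # e.g. "DEL", "DUP", "INV"


def matches(call: SV, truth: SV, tol: int = 200) -> bool:
    """Assumed matching rule: same chromosome and type, and both
    breakpoints within `tol` base pairs of the truth breakpoints."""
    return (
        call.chrom == truth.chrom
        and call.svtype == truth.svtype
        and abs(call.start - truth.start) <= tol
        and abs(call.end - truth.end) <= tol
    )


def evaluate(calls, truths, tol: int = 200):
    """Return (sensitivity, precision) using greedy one-to-one matching."""
    matched_truth = set()  # indices of truth SVs already claimed by a call
    true_positive_calls = 0
    for call in calls:
        for i, truth in enumerate(truths):
            if i not in matched_truth and matches(call, truth, tol):
                matched_truth.add(i)
                true_positive_calls += 1
                break
    sensitivity = len(matched_truth) / len(truths) if truths else 0.0
    precision = true_positive_calls / len(calls) if calls else 0.0
    return sensitivity, precision


# Toy data: one call recovers a true deletion, one call is a false positive,
# and one true duplication is missed.
truths = [SV("chr1", 1000, 5000, "DEL"), SV("chr2", 20000, 30000, "DUP")]
calls = [SV("chr1", 1050, 4950, "DEL"), SV("chr3", 100, 900, "INV")]
sensitivity, precision = evaluate(calls, truths)
```

In this toy case both metrics equal 0.5. Real benchmarks typically use more careful matching criteria (for example, reciprocal overlap or type-specific rules), so the tolerance here should be read as a placeholder.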

Highlights

Cancer is a disease of the genome that develops through the accumulation of somatic mutations, ranging from single nucleotide variants (SNVs) and insertions/deletions of a few nucleotides to large structural variants (SVs) [1].
Structural variants are an important type of genomic alteration in cancer, but are intrinsically more difficult to detect than small variants from short-read next-generation sequencing (NGS) data.
Recent studies have attempted to compare the performance of a variety of SV callers, but these have focused predominantly on germline SVs and simple SV types [8,9] and only on overall performance for somatic SVs [10].

Summary

Introduction

Cancer is a disease of the genome that develops through the accumulation of somatic mutations (variants), ranging from single nucleotide variants (SNVs) and insertions/deletions (indels) of a few nucleotides to large structural variants (SVs) [1]. While sequencing more reads (higher depth of coverage) can sometimes compensate for this, it provides limited advantage at genomic regions with low sequence complexity (e.g. repetitive sequences) or regions of high sequence similarity (e.g. segmentally duplicated regions). These regions can lead to ambiguous read alignments, which are a significant source of false-positive variant calls. While increasing sequencing coverage can assist in capturing low-abundance tumour SVs, in many cases it is unclear whether the information gained justifies the associated increase in cost [7]. These challenges have driven the development and refinement of multiple SV detection methods and SV calling software (SV callers) in the last decade, each with its own advantages and disadvantages. We evaluate and quantify each SV caller’s ability to detect different SV types and size ranges, the individual and interaction effects of SV abundance and sequencing coverage, the callers’ precision in predicting genomic breakpoints, and the impact of sequence similarity (genomic segmental duplications) on somatic SV detection.
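The trade-off between SV abundance and sequencing coverage can be made concrete with a rough back-of-the-envelope sketch. The model below is an assumption for illustration, not the paper's method: it treats each read spanning a breakpoint as independently supporting the variant with probability equal to the variant allele frequency (a simple binomial model), and it uses a hypothetical requirement of at least `min_support` supporting reads for a call. It ignores alignment ambiguity, which is why extra depth helps less in repetitive or duplicated regions.

```python
from math import comb


def expected_support(depth: int, vaf: float) -> float:
    """Expected number of variant-supporting reads at a breakpoint."""
    return depth * vaf


def prob_at_least(min_support: int, depth: int, vaf: float) -> float:
    """P(X >= min_support) for X ~ Binomial(depth, vaf).

    `min_support` is a hypothetical caller threshold, not a value
    taken from any specific SV caller.
    """
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(min_support)
    )
    return 1.0 - p_fewer


# Compare 30x and 60x coverage for a low-abundance SV (VAF = 0.1),
# assuming a caller needs at least 3 supporting reads.
p_30x = prob_at_least(3, depth=30, vaf=0.1)
p_60x = prob_at_least(3, depth=60, vaf=0.1)
```

Under these assumptions, doubling the depth raises the chance of seeing enough supporting reads, while the expected support at 60x and a VAF of 0.1 is still only about six reads. Whether that gain is worth the added sequencing cost is the open question the text raises.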

Journal: Briefings in bioinformatics	Publication Date: May 7, 2020
Citations: 44	License type: CC BY 4.0
