CLOVE: classification of genomic fusions into structural variation events

Jan Schröder, Bertil Schmidt, Anthony T Papenfuss, Adrianto Wirawan

doi:10.1186/s12859-017-1760-3

Abstract

Background: A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and so do not collect related breakpoints and classify them as a particular type of SV. Due to the rapidly increasing usage of high throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than one breakpoint or fusion).

Results: We present CLOVE, an algorithm for integrating the results of multiple breakpoint or SV callers and classifying the results as a particular SV. CLOVE is based on a graph data structure that is created from the breakpoint information. The algorithm looks for patterns in the graph that are characteristic of more complex rearrangement types. CLOVE is able to integrate the results of multiple callers, producing a consensus call.

Conclusions: We demonstrate using simulated and real data that re-classified SV calls produced by CLOVE improve on the raw call set of existing SV algorithms, particularly in terms of accuracy. CLOVE is freely available from http://www.github.com/PapenfussLab.
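
The graph-based idea in the abstract can be illustrated with a small toy sketch. This is not CLOVE's implementation: the tuple layout for a fusion, the `tolerance` parameter, the function names, and the single inversion pattern (two fusions joining the same pair of loci with `+/+` and `-/-` orientations) are all assumptions made for illustration. Nearby breakpoint coordinates are merged into one node, fusions become edges, and a characteristic edge pattern between two nodes is labelled as an event.

```python
from collections import defaultdict

def find_node(nodes, chrom, pos, tolerance):
    """Return an existing node within `tolerance` bp, or register a new one."""
    for node in nodes:
        if node[0] == chrom and abs(node[1] - pos) <= tolerance:
            return node
    nodes.append((chrom, pos))
    return (chrom, pos)

def build_graph(fusions, tolerance=50):
    """Build a toy breakpoint graph.

    Each fusion is a hypothetical tuple
    (chrom1, pos1, strand1, chrom2, pos2, strand2) with pos1 < pos2
    when both ends are on the same chromosome.
    """
    nodes = []
    edges = defaultdict(list)  # (node_a, node_b) -> list of orientation pairs
    for chrom1, pos1, strand1, chrom2, pos2, strand2 in fusions:
        a = find_node(nodes, chrom1, pos1, tolerance)
        b = find_node(nodes, chrom2, pos2, tolerance)
        edges[(a, b)].append((strand1, strand2))
    return edges

def classify(edges):
    """Label node pairs that show the assumed inversion pattern."""
    calls = []
    for (a, b), orientations in edges.items():
        same_chrom = a[0] == b[0]
        if same_chrom and {("+", "+"), ("-", "-")} <= set(orientations):
            calls.append(("inversion", a, b))
        else:
            calls.append(("unclassified", a, b))
    return calls

# Made-up example input: two fusions that pair up, one that does not.
fusions = [
    ("chr1", 1000, "+", "chr1", 5000, "+"),
    ("chr1", 1010, "-", "chr1", 5005, "-"),
    ("chr2", 300, "+", "chr3", 700, "-"),
]
calls = classify(build_graph(fusions))
```

Because the fusions could come from any caller, concatenating several callers' outputs before building the graph is one simple way such a structure could support a consensus call; how CLOVE itself does this is not described in this excerpt.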

Highlights

A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity
We investigate sensitivity, precision, and accuracy, which requires the classic contingency counts of true positives (TP), false positives (FP), and false negatives (FN)
To introduce rearrangements into the sequence context, we compare the read data to a slightly distant reference strain: E. coli K12. This causes a number of relative genomic rearrangements in the donor genome on which we can test the effectiveness of CLOVE
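
The contingency counts mentioned in the highlights map onto the standard definitions of sensitivity and precision. The sketch below uses those textbook formulas; the function names and the example counts are hypothetical and are not results from the paper. The excerpt does not say how accuracy is defined for SV calls, so it is left out.

```python
def sensitivity(tp, fn):
    """Fraction of true events that were called: TP / (TP + FN)."""
    total = tp + fn
    return tp / total if total else 0.0

def precision(tp, fp):
    """Fraction of calls that are true events: TP / (TP + FP)."""
    total = tp + fp
    return tp / total if total else 0.0

# Hypothetical counts, for illustration only.
tp, fp, fn = 80, 20, 20
sens = sensitivity(tp, fn)
prec = precision(tp, fp)
```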

Summary

Introduction

A precise understanding of structural variants (SVs) in DNA is important in the study of population diversity, cancer [2,3,4] and other diseases (e.g. Charcot-Marie-Tooth disease [5] and autism [6]). Many methods have been designed to identify SVs from DNA sequencing data, and the increasing usage of high throughput sequencing technologies has led to advances in the discovery and genotyping of structural variants in germline and somatic cells [7,8,9]. Due to this rapidly increasing usage, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than one breakpoint or fusion).

Journal: BMC Bioinformatics
Publication Date: Jul 20, 2017
Citations: 8
License type: open-access
