Wg-blimp: an end-to-end analysis pipeline for whole genome bisulfite sequencing data

Marius Wöste, Sandra Laurentino, Christopher Schröder, Sven Rahmann, Bernhard Horsthemke, Elsa Leitão

doi:10.1186/s12859-020-3470-5

Abstract

Background: Analysing whole genome bisulfite sequencing datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results. While many algorithms have been developed for tasks such as alignment, comprehensive end-to-end pipelines are still sparse. Furthermore, previous pipelines lack features or show technical deficiencies, thus impeding analyses.

Results: We developed wg-blimp (whole genome bisulfite sequencing methylation analysis pipeline) as an end-to-end pipeline to ease whole genome bisulfite sequencing data analysis. It integrates established algorithms for alignment, quality control, methylation calling, detection of differentially methylated regions, and methylome segmentation, requiring only a reference genome and raw sequencing data as input. Comparing wg-blimp to previous end-to-end pipelines reveals similar setups for common sequence processing tasks, but shows differences for post-alignment analyses. We improve on previous pipelines by providing a more comprehensive analysis workflow as well as an interactive user interface. To demonstrate wg-blimp’s ability to produce correct results we used it to call differentially methylated regions for two publicly available datasets. We were able to replicate 112 of 114 previously published regions, and found results to be consistent with previous findings. We further applied wg-blimp to a publicly available sample of embryonic stem cells to showcase methylome segmentation. As expected, unmethylated regions were in close proximity of transcription start sites. Segmentation results were consistent with previous analyses, despite different reference genomes and sequencing techniques.

Conclusions: wg-blimp provides a comprehensive analysis pipeline for whole genome bisulfite sequencing data as well as a user interface for simplified result inspection. We demonstrated its applicability by analysing multiple publicly available datasets. Thus, wg-blimp is a relevant alternative to previous analysis pipelines and may facilitate future epigenetic research.
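The replication figure quoted in the abstract (112 of 114 published regions) implies some rule for deciding when a called region matches a published one. The abstract does not state that rule, so the following is only a minimal sketch under an assumed criterion: a published region counts as replicated if any called region on the same chromosome overlaps it by at least one base. The names `Region`, `overlaps`, and `count_replicated` are hypothetical, and the coordinates are toy values, not data from the paper.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates [start, end).
# Hypothetical representation, not wg-blimp's own data model.
Region = Tuple[str, int, int]


def overlaps(a: Region, b: Region) -> bool:
    """Return True if two regions share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def count_replicated(published: List[Region], called: List[Region]) -> int:
    """Count published regions overlapped by at least one called region.

    Assumed criterion (any overlap); the paper may use a stricter one.
    """
    return sum(any(overlaps(p, c) for c in called) for p in published)


# Toy coordinates for illustration only.
published = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
called = [("chr1", 150, 250), ("chr2", 10, 60), ("chr3", 0, 1000)]

replicated = count_replicated(published, called)
print(f"{replicated} of {len(published)} published regions replicated")
```

The quadratic scan is adequate for a few hundred regions; for genome-scale call sets, sorted intervals or an interval tree would be the usual choice.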

Highlights

Analysing whole genome bisulfite sequencing datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results.
Implementation: We present here wg-blimp, a workflow for automated in silico processing of whole genome bisulfite sequencing (WGBS) data.
Comparison to previous pipelines: Since wg-blimp only integrates published software, and exhaustive evaluation of all conceivable pipeline setups would result in combinatorial explosion, we focus here on a feature-wise comparison of pipelines, similar to [19].

Summary

Introduction

Analysing whole genome bisulfite sequencing (WGBS) datasets is a data-intensive task that requires comprehensive and reproducible workflows to generate valid results. We present here wg-blimp (whole genome bisulfite sequencing methylation analysis pipeline), a workflow for automated in silico processing of WGBS data. It consists of a comprehensive WGBS data analysis pipeline as well as a user interface for simplified inspection of datasets and potential sharing of results with other researchers.

Journal: BMC Bioinformatics
Publication Date: May 1, 2020
Citations: 17
License type: open-access

Similar Papers

Identification of cell type-specific methylation signals in bulk whole genome bisulfite sequencing data
C Anthony Scott ... Cristian Coarfa
Genome Biology | VOL. 21
01 Jul 2020

A pipeline for sample tagging of whole genome bisulfite sequencing data using genotypes of whole genome sequencing
Zhe Xu ... Yanfeng Shi
BMC Genomics | VOL. 24
23 Jun 2023

Bicycle: a bioinformatics pipeline to analyze bisulfite sequencing data.
Osvaldo Graña ... Hugo López-Fernández
Bioinformatics | VOL. 34
01 Dec 2017

GBSA: a comprehensive software for analysing whole genome bisulfite sequencing data
Touati Benoukraf ... Mengchu Wu
Nucleic Acids Research | VOL. 41
24 Dec 2012
