Abstract

Background: Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges. For next generation sequencing (NGS), these include data storage, computational costs, sequence processing and alignment, delineating appropriate statistical measures, and data visualization. Currently there is a lack of workflows for efficient analysis of large MethylCap-seq datasets containing multiple sample groups.

Methods: The NGS application MethylCap-seq involves the in vitro capture of methylated DNA and subsequent analysis of enriched fragments by massively parallel sequencing. The workflow we describe performs MethylCap-seq experimental Quality Control (QC), sequence file processing and alignment, differential methylation analysis of multiple biological groups, hierarchical clustering, assessment of genome-wide methylation patterns, and preparation of files for data visualization.

Results: Here, we present a scalable, flexible workflow for MethylCap-seq QC, secondary data analysis, tertiary analysis of multiple experimental groups, and data visualization. We demonstrate the experimental QC procedure with results from a large ovarian cancer study dataset and propose parameters that can identify problematic experiments. Promoter methylation profiling and hierarchical clustering analyses are demonstrated for four groups of acute myeloid leukemia (AML) patients. We propose a Global Methylation Indicator (GMI) function to assess genome-wide changes in methylation patterns between experimental groups. We also show how the workflow facilitates data visualization in a web browser with the application Anno-J.

Conclusions: This workflow and its suite of features will assist biologists in conducting methylation profiling projects and facilitate meaningful biological interpretation.

Highlights

  • Advances in whole genome profiling have revolutionized the cancer research field, but at the same time have raised new bioinformatics challenges

  • The addition of methyl groups to the 5-carbon position of cytosine bases is a major mechanism of epigenetic regulation that participates in reorganizing chromatin structure and silencing gene expression [2]. Epigenetic alterations, such as tumor suppressor gene hypermethylation and oncogene hypomethylation, are hallmarks of cancer and play a pivotal role in tumorigenesis and disease progression [3,4]

  • Experimental quality control: The automated MethylCap-seq workflow has been developed over the course of 200 sequencing runs


Summary

Introduction

Advances in whole genome profiling technologies have revolutionized the field of cancer research, but at the same time have raised new bioinformatics challenges. These technologies have facilitated the discovery of potential biomarkers for disease development and progression as well as our […]. The DNA methylation profiling approach used in our lab, MethylCap-seq, involves the in vitro capture of methylated DNA with the high-affinity methyl-CpG binding domain of human MBD2 protein and subsequent analysis of enriched fragments by massively parallel sequencing [5,6,7,8]. Benchmarking has shown that MethylCap-seq is more effective at interrogating CpG islands than antibody-based methylated DNA immunoprecipitation sequencing (MeDIP-seq) [9]. While optimizing this experimental technique, we recognized two potential issues affecting subsequent data analysis. Spurious samples reduce analytical power and lead to excess “noise” in downstream analyses.
