Ngs.plot: Quick mining and visualization of next-generation sequencing data by integrating genomic databases.

Li Shen, Ningyi Shao, Eric Nestler, Xiaochuan Liu

doi:10.1186/1471-2164-15-284

Abstract

Background: Understanding the relationship between the millions of functional DNA elements and their protein regulators, and how they work in conjunction to manifest diverse phenotypes, is key to advancing our understanding of the mammalian genome. Next-generation sequencing technology is now used widely to probe these protein-DNA interactions and to profile gene expression at a genome-wide scale. As the cost of DNA sequencing continues to fall, the interpretation of the ever increasing amount of data generated represents a considerable challenge.

Results: We have developed ngs.plot – a standalone program to visualize enrichment patterns of DNA-interacting proteins at functionally important regions based on next-generation sequencing data. We demonstrate that ngs.plot is not only efficient but also scalable. We use a few examples to demonstrate that ngs.plot is easy to use and yet very powerful to generate figures that are publication ready.

Conclusions: We conclude that ngs.plot is a useful tool to help fill the gap between massive datasets and genomic information in this era of big sequencing data.

Highlights

Understanding the relationship between the millions of functional DNA elements and their protein regulators, and how they work in conjunction to manifest diverse phenotypes, is key to advancing our understanding of the mammalian genome.
The ability of next-generation sequencing to produce more than one billion sequencing reads within the timeframe of a few days [1] has enabled the investigation of tens of thousands of biological events in parallel [2,3].
Designing a genome browser that can effectively manage the enormous amount of genomic information has become an important research topic in the past decade, with dozens of tools developed to date [6,7,8].

Summary

Background

Next-generation sequencing (NGS) technology has become the de facto indispensable tool to study genomics and epigenomics in recent years.

As sequencing output increased rapidly (which inevitably creates values at originally zero-value regions), the strategy of storing coverage in run-length encoding (RLE) files soon became a major problem: the files grew too large and consumed a lot of memory during loading. Another challenge arose when dealing with epigenomic marks that have broad patterns of enrichment: the coverage vectors are dense and may consume a lot of memory.

Gene deserts, pericentromeres and subtelomeres are used to build a genome package for the “region analysis” utility (https://github.com/shenlabsinai/region_analysis) on the fly, which is used to perform location-based classifications on CpG islands (CGIs) and DNase I hypersensitive sites (DHSs). In total, more than 60 million functional elements have been incorporated into ngs.plot’s database so far (Table 1).

The differential chromatin modification sites were detected by diffReps [42] using default parameters, and the FDR cutoff was set to 0.1.
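The memory problem with RLE files can be illustrated with a small sketch. The code below is not taken from ngs.plot; it is a minimal Python illustration, assuming a toy per-base coverage vector, of how run-length encoding compresses long stretches of identical values and how scattered reads landing in formerly zero-value regions multiply the number of runs. The function names `rle_encode` and `rle_decode` and both toy vectors are hypothetical.

```python
from itertools import groupby


def rle_encode(coverage):
    """Collapse a per-base coverage vector into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(coverage)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into a per-base vector."""
    out = []
    for value, length in runs:
        out.extend([value] * length)
    return out


# Hypothetical toy coverage vectors (not data from the paper).
# Shallow sequencing: one enriched block, zeros everywhere else.
sparse = [0] * 20 + [5] * 10 + [0] * 20
# Deeper sequencing: the same block, but scattered reads now
# break up the formerly zero-value flanks.
deeper = [0, 1] * 10 + [5] * 10 + [1, 0] * 10

sparse_runs = rle_encode(sparse)
deeper_runs = rle_encode(deeper)

# Both vectors cover 50 positions, but the run counts differ sharply.
print(len(sparse), len(sparse_runs))
print(len(deeper), len(deeper_runs))
```

In this toy case the two vectors have the same length, yet the deeper one needs many more runs to encode, which is the general reason an RLE representation loses its size advantage as zero-value regions fill in.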

Results and discussion

Conclusion

Metzker ML


Journal: BMC Genomics	Publication Date: Jan 1, 2014
Citations: 856	License type: cc-by


