GSuite HyperBrowser: integrative analysis of dataset collections across the genome and epigenome.

Boris Simovski, Mads Bengtsen, Ragnhild Eskeland, Ivar Grytten, Geir Kjetil Sandve, Johannes Andreas Akse, Bastian Fromm, Egil Ferkingstad, Daniel Vodák, Christin Lund-Andersen, Eivind Hovig, Lars Holden, Morten Johansen, Finn Drabløs, Hildur Sif Thorarensen, Sveinung Gundersen, Abdulrahman Azab, Sigve Nakken, Alexander Johan Nederbragt, Antonio M Mora, Odd S Gabrielsen, Knut Dagestad Rand, Diana Domańska, Ingrid K Glad, Matthew T G Holden

doi:10.1093/gigascience/gix032

Abstract

Background: Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. Despite the high potential value of these publicly available data for a broad variety of investigations, little attention has been given to the analytical methodology necessary for their widespread utilisation.

Findings: We here present a first principled treatment of the analysis of collections of genomic tracks. We have developed novel computational and statistical methodology to permit comparative and confirmatory analyses across multiple and disparate data sources. We delineate a set of generic questions that are useful across a broad range of investigations and discuss the implications of choosing different statistical measures and null models. Examples include contrasting analyses across different tissues or diseases. The methodology has been implemented in a comprehensive open-source software system, the GSuite HyperBrowser. To make the functionality accessible to biologists, and to facilitate reproducible analysis, we have also developed a web-based interface providing an expertly guided and customizable way of utilizing the methodology. With this system, many novel biological questions can flexibly be posed and rapidly answered.

Conclusions: Through a combination of streamlined data acquisition, interoperable representation of dataset collections, and customizable statistical analysis with guided setup and interpretation, the GSuite HyperBrowser represents a first comprehensive solution for integrative analysis of track collections across the genome and epigenome. The software is available at: https://hyperbrowser.uio.no.

Highlights

Recent large-scale undertakings such as Encyclopedia of DNA Elements (ENCODE) and Roadmap Epigenomics have generated experimental data mapped to the human reference genome representing a variety of functional elements across a large number of cell types
Most of these datasets are in the form of genomic tracks, i.e., sets of elements anchored to locations in a reference genome, which provide a good foundation for the integration of data representing disparate genomic features
The present work is concerned with sets of information elements anchored to specific coordinates in a reference genome, which we refer to as genomic tracks

Summary

Introduction

Recent large-scale undertakings such as ENCODE and Roadmap Epigenomics have generated experimental data mapped to the human reference genome (as genomic tracks) representing a variety of functional elements across a large number of cell types. The Encyclopedia of DNA Elements (ENCODE) [1] project marked a substantial leap in this respect by making available to the human genomics community a broad collection of cell line–specific data on DNA accessibility and transcription factor binding. Kundaje et al. [2] refer to the combined collection of ENCODE and Roadmap data as 127 human reference epigenomes. Most of these datasets are in the form of genomic tracks, i.e., sets of elements anchored to locations in a reference genome, which provide a good foundation for the integration of data representing disparate genomic features.
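
As a rough illustration of the definition above, a genomic track can be modelled as a set of intervals on chromosomes, and a collection of tracks as a mapping from names to tracks. The sketch below is not the GSuite HyperBrowser implementation or its file format. The names `Segment`, `overlap_bp`, and `rank_collection`, the track labels, and all coordinates are hypothetical, and base-pair overlap stands in for whichever similarity measure an analysis would actually choose.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """One track element: a half-open interval [start, end) on a chromosome."""
    chrom: str
    start: int
    end: int


def overlap_bp(track_a: list[Segment], track_b: list[Segment]) -> int:
    """Count base pairs covered by both tracks.

    Assumes the segments within each track do not overlap one another.
    """
    total = 0
    for a in track_a:
        for b in track_b:
            if a.chrom == b.chrom:
                # Length of the intersection of the two intervals, or 0 if disjoint.
                total += max(0, min(a.end, b.end) - max(a.start, b.start))
    return total


def rank_collection(query: list[Segment],
                    collection: dict[str, list[Segment]]) -> list[tuple[str, int]]:
    """Rank every track in a collection by its overlap with a query track."""
    scores = [(name, overlap_bp(query, track)) for name, track in collection.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical query track and a small collection of tracks (made-up coordinates).
query = [Segment("chr1", 100, 200), Segment("chr2", 50, 80)]
collection = {
    "cell_type_A": [Segment("chr1", 150, 250)],
    "cell_type_B": [Segment("chr1", 90, 210), Segment("chr2", 60, 70)],
    "cell_type_C": [Segment("chr3", 0, 1000)],
}

ranking = rank_collection(query, collection)
print(ranking)
```

The quadratic pairwise loop keeps the example short. A raw overlap count also says nothing about statistical significance, which is where the choice of null model discussed in the abstract would come in.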

Journal: GigaScience	Publication Date: Apr 27, 2017
Citations: 23	License type: CC BY 4.0
