ANIMA: Association network integration for multiscale analysis.

Armin Deffur, Nicola M Mulder, Bongani M Mayosi, Robert J Wilkinson

doi:10.12688/wellcomeopenres.14073.3

Abstract

Contextual functional interpretation of -omics data derived from clinical samples is a classical and difficult problem in computational systems biology. The measurement of thousands of data points on single samples has become routine, but relating 'big data' datasets to the complexities of human pathobiology is an area of ongoing research. Complicating this is the fact that many publicly available datasets use bulk transcriptomics data from complex tissues like blood. The most prevalent analytic approaches derive molecular 'signatures' of disease states or apply modular analysis frameworks to the data. Here we describe ANIMA (association network integration for multiscale analysis), a network-based data integration method using clinical phenotype and microarray data as inputs. ANIMA is implemented in R and Neo4j and runs in Docker containers. In short, the build algorithm iterates over one or more transcriptomics datasets to generate a large, multipartite association network by executing multiple independent analytic steps (differential expression, deconvolution, modular analysis based on co-expression, pathway analysis) and integrating the results. Once the network is built, it can be queried directly using Cypher (a graph query language), or by custom functions that communicate with the graph database via language-specific APIs. We developed a web application using Shiny, which provides fully interactive, multiscale views of the data. Using our approach, we show that we can reconstruct multiple features of disease states at various scales of organization, from transcript abundance patterns of individual genes through co-expression patterns of groups of genes to patterns of cellular behaviour in whole blood samples, both in single experiments and in meta-analyses of multiple datasets.
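The integration step described in the abstract can be illustrated with a small sketch. This is not the ANIMA build algorithm (which is implemented in R and Neo4j); it is a minimal Python analogue that assumes each analytic step has already produced a list of typed associations. All names here (`build_association_network`, `neighbours_of_type`, the node labels, and the weights) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_association_network(step_results):
    """Merge edge lists from independent analysis steps into one graph.

    step_results maps a step name to a list of
    (source, source_type, target, target_type, weight) tuples.
    Returns (node_types, adjacency), where adjacency maps each node to
    a list of (neighbour, step, weight) entries.
    """
    node_types = {}
    adjacency = defaultdict(list)
    for step, edges in step_results.items():
        for source, source_type, target, target_type, weight in edges:
            node_types[source] = source_type
            node_types[target] = target_type
            # Store the association in both directions so that it can be
            # traversed from either end.
            adjacency[source].append((target, step, weight))
            adjacency[target].append((source, step, weight))
    return node_types, dict(adjacency)


def neighbours_of_type(node_types, adjacency, node, wanted_type):
    """Return the sorted neighbours of `node` that have the given type."""
    return sorted(
        {n for n, _, _ in adjacency.get(node, []) if node_types[n] == wanted_type}
    )


# Toy input: every identifier and number below is invented.
step_results = {
    "differential_expression": [
        ("GENE_A", "gene", "disease_vs_control", "contrast", 1.8),
    ],
    "co_expression_modules": [
        ("GENE_A", "gene", "module_1", "module", 0.9),
        ("GENE_B", "gene", "module_1", "module", 0.7),
    ],
    "pathway_analysis": [
        ("module_1", "module", "pathway_X", "pathway", 0.01),
    ],
    "deconvolution": [
        ("module_1", "module", "cell_type_Y", "cell_type", 0.6),
    ],
}

node_types, adjacency = build_association_network(step_results)
print(neighbours_of_type(node_types, adjacency, "module_1", "gene"))
```

Because every node carries a type, one structure can answer questions at several scales, for example moving from a module to its genes, its enriched pathways, or its associated cell types.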

Highlights

A frequent issue with bioinformatic analysis is the following scenario: a given dataset is analysed using various approaches in a linear workflow, in an attempt to extract biological features of interest from the data, often at different scales.
Where the datasets contained samples from multiple timepoints, we restricted the analysis to healthy controls and the first disease timepoint.
In addition to the standard web browser interface for Neo4j, into which the above query can be directly entered and returned in the browser window (Figure 2A), we provide a function in R that returns the network found and can plot it within R (Figure 2B), export it as node and edge lists for import into other software such as Cytoscape (Figure 2C), or return it as an igraph[23] object for further manipulation within R. (A file ANIMA_styles.xml is included in the common folder supplied with the source code; this is a Cytoscape stylesheet that reproduces the colouring shown in Figure 2C.)
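The export of a query result as node and edge lists can be sketched as follows. This is a Python analogue under stated assumptions, not the R function supplied with ANIMA: it assumes the query result is already available as (source, relationship, target) triples, and the function names, column headers, and toy triples are all hypothetical.

```python
import csv
import io


def to_node_and_edge_lists(triples):
    """Split (source, relationship, target) triples into node and edge lists."""
    nodes = sorted({node for source, _, target in triples for node in (source, target)})
    edges = [(source, relationship, target) for source, relationship, target in triples]
    return nodes, edges


def rows_to_csv(header, rows):
    """Serialise rows as CSV text, a format most network tools can import."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Toy query result: identifiers are invented for illustration.
triples = [
    ("GENE_A", "member_of", "module_1"),
    ("GENE_B", "member_of", "module_1"),
    ("module_1", "enriched_for", "pathway_X"),
]

nodes, edges = to_node_and_edge_lists(triples)
node_csv = rows_to_csv(["id"], [(node,) for node in nodes])
edge_csv = rows_to_csv(["source", "interaction", "target"], edges)
print(node_csv)
print(edge_csv)
```

Keeping nodes and edges in separate tables means node attributes (such as type) can be added as extra columns without repeating them on every edge.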

Summary

Introduction

A frequent issue with bioinformatic analysis is the following scenario: a given dataset is analysed using various approaches in a linear workflow, in an attempt to extract biological features of interest from the data, often at different scales. Providing the code used in analysis and the raw data does not guarantee reproducibility, as the computational environment in which the analysis is run can influence the outcome of computations. This becomes an issue when the functionality of a software package changes between versions. In addition to data and source code, one has to provide the exact configuration and package versions of all software involved in the project. The containerization platform Docker[4] is frequently used to fulfil this function and has enjoyed widespread adoption in reproducible research. This is exemplified by Nextflow[5], a workflow management system using Docker.
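As a small illustration of what recording "exact configuration and package versions" can look like, the sketch below captures the interpreter version and installed package versions of a Python environment. It is only an analogue: ANIMA itself relies on Docker containers and R, and the function name `snapshot_environment` is hypothetical.

```python
import json
import platform
from importlib import metadata


def snapshot_environment():
    """Collect interpreter, platform, and installed package versions."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions with incomplete metadata
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


snapshot = snapshot_environment()
# Serialise the snapshot so it can be stored alongside data and source code.
snapshot_json = json.dumps(snapshot, indent=2)
print(snapshot_json[:200])
```

A snapshot like this documents an environment but does not recreate it; a container image goes further by packaging the environment itself.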

Journal: Wellcome Open Research
Publication Date: Nov 14, 2018
Citations: 5
License type: CC BY 4.0
