Abstract

The development of integrative methods is one of the main challenges in bioinformatics. Network-based methods for the analysis of multiple gene-centered datasets take into account known and/or inferred relations between genes. In recent decades, the mathematical machinery of network diffusion, also referred to as network propagation, has been exploited in several network-based pipelines, thanks to its ability to amplify associations between genes that lie in network proximity. Indeed, network diffusion provides a quantitative estimate of the network proximity between genes associated with one or more data types, from simple binary vectors to real-valued vectors. This data transformation method has therefore been increasingly used in integrative analyses of multiple collections of biological scores and/or one or more interaction networks. We present an overview of the state of the art of bioinformatics pipelines that use network diffusion processes for the integrative analysis of omics data. We discuss the fundamental ways in which network diffusion is exploited, open issues, and potential developments in the field. Current trends suggest that network diffusion is a tool of broad utility in omics data analysis. It is reasonable to expect that it will continue to be used and further refined as new data types arise (e.g. single-cell datasets) and as the identification of system-level patterns becomes increasingly important in omics data analysis.

Highlights

  • “Omics” technologies provide data related to different types of molecular entities (e.g. DNAs, RNAs, proteins) at increasing sensitivity, down to single-cell level (Hu et al, 2018)

  • Network-based approaches enable the study of the relation between the topological and dynamical properties of a network and the biological system modelled by means of the network

  • The mathematical machinery of network diffusion (ND), also referred to as network propagation, has been exploited in many network-based pipelines with different aims, such as gene prioritization, gene module identification, drug target prediction and disease subtyping, thanks to its ability to amplify associations between variables that lie in network proximity (Cowen et al, 2017)
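To make the idea of amplifying associations in network proximity concrete, the following is a minimal sketch of one common formulation of network diffusion, random walk with restart. It is an illustration under stated assumptions, not the procedure of any specific pipeline covered by the review: the toy network, the gene names (`g1` to `g6`), the function name `diffuse` and the value `alpha=0.7` are all hypothetical.

```python
def diffuse(adjacency, seeds, alpha=0.7, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected network (illustrative sketch).

    adjacency: dict mapping each gene to the set of its neighbours
    seeds:     dict mapping genes to initial scores (binary or real-valued)
    alpha:     fraction of the score that keeps spreading at each step;
               the remaining (1 - alpha) is re-injected at the seeds
    """
    nodes = sorted(adjacency)
    degree = {n: len(adjacency[n]) for n in nodes}
    f0 = {n: float(seeds.get(n, 0.0)) for n in nodes}
    f = dict(f0)
    for _ in range(max_iter):
        new = {}
        for n in nodes:
            # Each neighbour m passes an equal share of its score to n.
            incoming = sum(f[m] / degree[m] for m in adjacency[n])
            new[n] = alpha * incoming + (1 - alpha) * f0[n]
        delta = max(abs(new[n] - f[n]) for n in nodes)
        f = new
        if delta < tol:  # stop once the scores no longer change
            break
    return f


# Hypothetical toy network: a chain g1-g2-g3-g4-g5 plus an isolated gene g6.
network = {
    "g1": {"g2"},
    "g2": {"g1", "g3"},
    "g3": {"g2", "g4"},
    "g4": {"g3", "g5"},
    "g5": {"g4"},
    "g6": set(),
}

# Binary input layer: only g1 carries a signal (e.g. it is mutated).
scores = diffuse(network, {"g1": 1.0})
for gene in sorted(scores):
    print(gene, round(scores[gene], 4))
```

In this sketch the diffused score decreases with distance from the seed gene, and the disconnected gene receives no score. The same function accepts real-valued inputs, so each omics layer could be diffused separately over the same network before the layers are combined.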


Summary

INTRODUCTION

“Omics” technologies provide data related to different types of molecular entities (e.g. DNAs, RNAs, proteins) at increasing sensitivity, down to single-cell level (Hu et al, 2018). ND has been incorporated in many pipelines that jointly analyse biological networks and multiple collections of scores (“layers”) derived from omics assays. These ND-based methods for multi-omics data analysis will be the main focus of this review. Three broad categories can be recognized (Table 1 and Figure 1D): the topology of the network in use can be defined by means of a priori knowledge, e.g. collected from molecular interaction databases; alternatively, a network can be inferred from the analysis of one or more biological datasets; lastly, a mixed approach that combines a priori and novel knowledge is possible. StSVM (Cun and Fröhlich, 2013) first integrates omics data and subsequently applies ND to define a

TABLE 1 | Network diffusion-based methods for the integrative analyses of multiple biological layers

Method
DISCUSSION