A data driven approach reveals disease similarity on a molecular level

Kleanthi Lakiotaki, Elias Castanas, Giorgos Borboudakis, George Georgakopoulos, Oluf Dimitri Røe, Ioannis Tsamardinos

doi:10.1038/s41540-019-0117-0

Abstract

Could there be unexpected similarities between different studies, diseases, or treatments on a molecular level, due to common biological mechanisms involved? To answer this question, we develop a method for computing similarities between empirical, statistical distributions of high-dimensional, low-sample datasets, and apply it to hundreds of -omics studies. The similarities lead to dataset-to-dataset networks visualizing the landscape of a large portion of biological data. Potentially interesting similarities connecting studies of different diseases are assembled in a disease-to-disease network. Exploring it, we discover numerous non-trivial connections between Alzheimer’s disease and schizophrenia, asthma and psoriasis, or liver cancer and obesity, to name a few. We then present a method that identifies the molecular quantities and pathways that contribute the most to the identified similarities and could point to novel drug targets or provide biological insights. The proposed method acts as a “statistical telescope” providing a global view of the constellation of biological data; readers can peek through it at: http://datascope.csd.uoc.gr:25000/.

Highlights

Public biological data repositories currently hold tens of thousands of (bio)-datasets.
The question arises: how do the measurements from these studies compare against each other, what are their relations, and what is the collective, emerging picture and biological intuition they provide? Can we construct and look through a “statistical telescope” instead? Could it be that different diseases, treatments, other experimental or sampling conditions induce similar biological molecular patterns pointing to common pathophysiological pathways? Their identification could accelerate the deeper understanding of human pathology and the exploitation of clinical study results.
Overall, 103,088 Homo sapiens and Mus musculus samples were employed in the subsequent analyses and results, grouped in 978 datasets and spanning more than 500 different diseases and phenotypes, as revealed by automated text analysis.[2]

Summary

Introduction

Public biological data repositories currently hold tens of thousands of (bio)-datasets. As of October 2019, the NCBI Gene Expression Omnibus (GEO)[1] contains 3,263,365 microarray and RNA-Seq profiles, grouped into 119,386 data series. Each dataset studies a specific biological question regarding a disease, a treatment, or a phenotype. Examples include finding the gene expression differences between malignant and benign breast tissue, or creating a diagnostic model that distinguishes primary from metastatic lung cancer tumors. Data analysis methods typically focus on analyzing each dataset individually, like a “statistical microscope”. Can we construct and look through a “statistical telescope” instead? The question arises: how do the measurements from these studies compare against each other, what are their relations, and what is the collective, emerging picture and biological intuition they provide? Could it be that different diseases, treatments, other experimental or sampling conditions induce similar biological molecular patterns pointing to common pathophysiological pathways? Their identification could accelerate the deeper understanding of human pathology and the exploitation of clinical study results.

Journal: npj Systems Biology and Applications
Publication Date: Oct 25, 2019
Citations: 11
License type: open-access
