Abstract
Background: Each cell type found within the human body performs a diverse and unique set of functions, the disruption of which can lead to disease. However, there is currently no systematic mapping between cell types and the diseases they can cause.
Methods: In this study, we integrate protein–protein interaction data with high-quality cell-type-specific gene expression data from the FANTOM5 project to build the largest collection of cell-type-specific interactomes created to date. We develop a novel method, called gene set compactness (GSC), that contrasts the relative positions of disease-associated genes across 73 cell-type-specific interactomes to map genes associated with 196 diseases to the cell types they affect. We conduct text-mining of the PubMed database to produce an independent resource of disease-associated cell types, which we use to validate our method.
Results: The GSC method successfully identifies known disease–cell-type associations and highlights associations that warrant further study. These include the association between mast cells and multiple sclerosis; mast cells are currently being targeted in a phase 2 clinical trial for multiple sclerosis. Furthermore, we build a cell-type-based diseasome using the cell types identified as manifesting each disease, offering insight into diseases linked through etiology.
Conclusions: The data set produced in this study represents the first large-scale mapping of diseases to the cell types in which they are manifested and will therefore be useful in the study of disease systems. Overall, we demonstrate that our approach links disease-associated genes to the phenotypes they produce, a key goal within systems medicine.
Electronic supplementary material: The online version of this article (doi:10.1186/s13073-015-0212-9) contains supplementary material, which is available to authorized users.
Highlights
Each cell type found within the human body performs a diverse and unique set of functions, the disruption of which can lead to disease.
Because cellular pathways are represented within protein–protein interaction (PPI) networks, sets of disease-associated genes tend to cluster within these networks [4, 40].
This is exemplified by the results of gene prioritization tools such as PRINCE [14], which use the clustering of disease-associated genes within PPI networks to prioritize candidate genes.
Summary
It is estimated that there are at least 400 different cell types present within the human body [1], each performing a unique repertoire of functions, the disruption of which may lead to the development of a disease [2]. The cell types that disease-associated genes directly affect, and through which they promote disease development, have yet to be characterized or are still being debated. Identifying these cell types will further our understanding of the genetic basis of these diseases and of the underlying molecular pathways and processes. We refer to the cell types directly affected by the disease-associated genes as the disease-manifesting cell types.
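The general idea described in the abstract, contrasting how closely a disease's genes sit together across cell-type-specific interactomes, can be illustrated with a small sketch. This is not the published GSC method: as a stand-in it uses the mean pairwise shortest-path distance between disease genes, and every name, gene label, and data value below is a hypothetical toy chosen for illustration. A cell-type-specific interactome is assumed here to be the PPI network restricted to proteins whose genes are expressed in that cell type.

```python
from collections import deque


def cell_type_interactome(ppi_edges, expressed_genes):
    """Keep only interactions whose two partners are both expressed in the cell type."""
    adjacency = {}
    for a, b in ppi_edges:
        if a in expressed_genes and b in expressed_genes:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def mean_pairwise_distance(adjacency, genes):
    """Simplified compactness proxy (an assumption, not the paper's score).

    Averages the hop distance over all connected pairs of disease genes
    that are present in the interactome. Returns None if no pair is connected.
    """
    present = [g for g in genes if g in adjacency]
    total, count = 0, 0
    for i, a in enumerate(present):
        dist = shortest_path_lengths(adjacency, a)
        for b in present[i + 1:]:
            if b in dist:
                total += dist[b]
                count += 1
    return total / count if count else None


def rank_cell_types(ppi_edges, expression, disease_genes):
    """Rank cell types from most to least compact (smallest mean distance first)."""
    scores = {}
    for cell_type, expressed in expression.items():
        adjacency = cell_type_interactome(ppi_edges, expressed)
        score = mean_pairwise_distance(adjacency, disease_genes)
        if score is not None:
            scores[cell_type] = score
    return sorted(scores.items(), key=lambda item: item[1])


# Hypothetical toy data: five interactions, two cell types, one disease gene set.
ppi_edges = [("A", "B"), ("B", "C"), ("A", "X"), ("X", "Y"), ("Y", "C")]
expression = {
    "cell_type_1": {"A", "B", "C"},
    "cell_type_2": {"A", "X", "Y", "C"},
}
disease_genes = ["A", "C"]

ranking = rank_cell_types(ppi_edges, expression, disease_genes)
print(ranking)
```

In this toy example the disease genes are two hops apart in `cell_type_1` and three hops apart in `cell_type_2`, so `cell_type_1` ranks first. A realistic analysis would also need to judge each score against a background expectation, for example randomly sampled gene sets, which this sketch omits.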