Phenotypic differences between species are, in significant part, determined by their proteomic diversity. The link between proteomic and phenotypic diversity is best understood in the context of the pathways and biological processes in which proteins participate. While the conservation pattern of individual proteins across species is expected to follow the phylogenetic relationships among those species, the diversity patterns of individual pathways may not: certain pathways may be much more conserved between distantly related species than between two closely related species, owing to the species' ecological histories. Thus, a pathway-centric analysis of proteome conservation and diversity has important implications for the choice of an appropriate model organism when investigating specific aspects of human biology. Exploiting complete genome sequences and protein-coding gene annotations, we perform a comprehensive gene-set-centric analysis of proteomic diversity between humans and 54 eukaryotic organisms, resulting in a catalog of the organisms that are most similar to humans in terms of specific pathways, processes, expression patterns, and diseases. We corroborate our findings using species-specific mass spectrometry data. Our analysis provides a general framework for identifying conserved and unique pathways in a group of organisms, and a resource for prioritizing appropriate model systems to study a specific biological system in a reference organism such as humans.