Abstract

Genes with similar roles in the cell cluster on chromosomes, thus benefiting from coordinated regulation. This allows gene function to be inferred by transferring annotations from genomic neighbors, following the guilt-by-association principle. We performed a systematic search for co-occurrence of >1000 gene functions in genomic neighborhoods across 1669 prokaryotic, 49 fungal and 80 metazoan genomes, revealing prevalent patterns that cannot be explained by clustering of functionally similar genes. It is a very common occurrence that pairs of dissimilar gene functions – corresponding to semantically distant Gene Ontology terms – are significantly co-located on chromosomes. These neighborhood associations are often as conserved across genomes as the known associations between similar functions, suggesting selective benefits from clustering of certain diverse functions, which may conceivably play complementary roles in the cell. We propose a simple encoding of chromosomal gene order, the neighborhood function profiles (NFP), which draws on diverse gene clustering patterns to predict gene function and phenotype. NFPs yield a 26–46% increase in predictive power over state-of-the-art approaches that propagate function across neighborhoods, thus providing hundreds of novel, high-confidence gene function inferences per genome. Furthermore, we demonstrate that copy number-neutral structural variation that shapes gene function distribution across chromosomes can predict phenotype of individuals from their genome sequence.

Highlights

  • Genes with similar roles in the cell cluster on chromosomes, benefiting from coordinated regulation

  • We used COG and NOG gene families and assigned each family a set of gene functions, represented by Gene Ontology (GO) terms

  • Here we refer to function in a general sense, encompassing all three sub-ontologies of the GO: biological process (BP), molecular function (MF) and cellular component (CC)


Summary

Introduction

Genes with similar roles in the cell cluster on chromosomes, benefiting from coordinated regulation. Pairs of dissimilar gene functions – corresponding to semantically distant Gene Ontology terms – are also very commonly and significantly co-located on chromosomes. These neighborhood associations are often as conserved across genomes as the known associations between similar functions, suggesting selective benefits from clustering of certain diverse functions, which may conceivably play complementary roles in the cell. We searched for pairs of gene functions that are highly dissimilar, according to the structure of the Gene Ontology, yet systematically cluster in genomic neighborhoods. If found, such clustering patterns would be able to predict gene function by drawing on information that is not accessible to previous automated approaches, which propagate a particular gene function across genomic neighborhoods. This has implications for understanding genome evolution and brings practical benefits for methods to predict gene function and phenotype from the genome sequence.
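
To make the idea of encoding gene order as a profile of neighboring functions more concrete, the following is a minimal sketch in Python. It is not the authors' implementation. It assumes a linear chromosome, a fixed window of neighboring genes on each side, and raw counts of GO terms as the profile values. The function name `neighborhood_function_profile`, the `window` parameter, the gene names and the toy annotations are all hypothetical and chosen for illustration only.

```python
from collections import Counter


def neighborhood_function_profile(gene_order, annotations, index, window=2):
    """Count GO terms annotated to the genes that lie within `window`
    positions of gene_order[index] on a linear chromosome.

    Assumption: the profile is a plain count vector; the paper's actual
    encoding, window definition and weighting may differ.
    """
    start = max(0, index - window)
    stop = min(len(gene_order), index + window + 1)
    profile = Counter()
    for pos in range(start, stop):
        if pos == index:
            continue  # exclude the focal gene's own annotations
        # Genes without annotations contribute nothing to the profile.
        profile.update(annotations.get(gene_order[pos], ()))
    return profile


# Toy chromosome: gene names and GO assignments are illustrative only.
gene_order = ["geneA", "geneB", "geneC", "geneD", "geneE"]
annotations = {
    "geneA": {"GO:0006810"},
    "geneB": {"GO:0006810", "GO:0016020"},
    "geneD": {"GO:0003677"},
    "geneE": {"GO:0003677", "GO:0006355"},
}

# Profile of the unannotated gene "geneC", built from its four neighbors.
profile_c = neighborhood_function_profile(gene_order, annotations, index=2, window=2)
```

In this sketch the profile of an unannotated gene records every function found nearby, similar or not, so a downstream classifier could in principle learn associations between dissimilar functions rather than only propagating one function across a neighborhood.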

