Abstract

Understanding the core set of genes that are necessary for basic developmental functions is one of the central goals in biology. Studies in model organisms have identified a significant fraction of essential genes through the analysis of null mutations that lead to lethality. Recent large-scale next-generation sequencing efforts have provided unprecedented data on genetic variation in humans. However, the evolutionary and genomic characteristics of human essential genes have never been directly studied on a genome-wide scale. Here we use detailed phenotypic resources available for the mouse and deep genomic sequencing data from human populations to characterize patterns of genetic variation and mutational burden in a set of 2,472 human orthologs of known essential genes in the mouse. Consistent with the action of strong purifying selection, these genes exhibit comparatively reduced levels of sequence variation, a skew in allele frequency towards rarer alleles, and increased conservation across the primate and rodent lineages relative to the remainder of genes in the genome. In individual genomes, we observed ∼12 rare mutations within essential genes that are predicted to be damaging. Consistent with the hypothesis that mutations in essential genes are risk factors for neurodevelopmental disease, we show that de novo variants in patients with Autism Spectrum Disorder are more likely to occur in this collection of genes. While incomplete, our set of human orthologs shows characteristics fully consistent with essential function in humans and thus provides a resource to inform and facilitate interpretation of sequence data in studies of human disease.

Highlights

  • Next-generation sequencing (NGS) technologies are routinely applied to evaluate the role of low-frequency and rare genetic variants in Mendelian and complex diseases [1,2,3]

  • By taking advantage of available sequence data in humans from large-scale sequencing studies [2,18], we aim to address two basic questions: first, are genes identified as ‘essential’ in the mouse evolutionarily conserved in humans; and second, how is this reflected in their mutational burden and impact on human disease? Our results show strong and consistent signatures of purifying selection within the set of essential genes, including increased sequence conservation, a reduced number of exonic missense variants and an overall shift in allele frequency towards rare alleles

  • Leveraging these results, we show that de novo mutations in Autism Spectrum Disorder (ASD) cases are significantly enriched in this gene set, using data from recent ASD studies

Summary

Introduction

Next-generation sequencing (NGS) technologies are routinely applied to evaluate the role of low-frequency and rare genetic variants in Mendelian and complex diseases [1,2,3]. There is intense interest in utilizing large-scale NGS datasets to characterize the natural background and burden of sequence variation in the human genome. The advent of these datasets makes it possible, for the first time, to estimate the burden of variation in the human genome in a direct and unbiased manner. Recent studies leveraging NGS data to estimate the burden of damaging exonic missense variants report ∼400 such variants per human genome [7,8]. With respect to loss-of-function (LoF) variants, a study of 185 human genome sequences finds a load of ∼100 high-confidence LoF variants per genome [9]. A recent study of autosomal recessive disease variants in a genetic isolate finds surprisingly high carrier frequencies for many of these variants [8]. A study of the evolutionary origins of human protein-coding variants reports that 86% of putatively deleterious variants are of very recent origin (5,000–10,000 years) [10].
