Abstract
We have assessed the numbers of potentially deleterious variants in the genomes of apparently healthy humans by using (1) low-coverage whole-genome sequence data from 179 individuals in the 1000 Genomes Pilot Project and (2) current predictions and databases of deleterious variants. Each individual carried 281-515 missense substitutions predicted to be highly damaging, 40-85 of which were homozygous. They also carried 40-110 variants classified by the Human Gene Mutation Database (HGMD) as disease-causing mutations (DMs), 3-24 of them in the homozygous state, and many polymorphisms putatively associated with disease. Whereas many of these DMs are likely to represent disease-allele-annotation errors, between 0 and 8 DMs (0-1 homozygous) per individual are predicted to be highly damaging, and some of them provide information of medical relevance. These analyses emphasize the need for improved annotation of disease alleles both in mutation databases and in the primary literature; some HGMD mutation data have been recategorized on the basis of the present findings, an iterative process that is both necessary and ongoing. Our estimates of deleterious-allele numbers are likely to be subject to both overcounting and undercounting. However, our current best mean estimates of ~400 damaging variants and ~2 bona fide disease mutations per individual are likely to increase rather than decrease as sequencing studies ascertain rare variants more effectively and as additional disease alleles are discovered.
Highlights
Genetic variation contributes to human ill health
The comprehensive catalogs of both high-penetrance variants underlying Mendelian disorders (Online Mendelian Inheritance in Man and Human Gene Mutation Database [HGMD])[1] and low-penetrance variants contributing to complex disorders (National Human Genome Research Institute [NHGRI] Catalog of Published Genome-wide Association Studies) attest to the progress made to date.
It is clearly important that such uncertain records be identified in order that genomic sequences can be reliably interpreted in a medical context, and this will be increasingly relevant as we enter a new era of personalized genomics.[5]
Summary
Genetic variation contributes to human ill health. Identifying the variants that underlie the disease phenotypes of affected individuals (such variants are referred to here as "disease variants" or "disease alleles") has been an important goal of medical geneticists for decades. The comprehensive catalogs of both high-penetrance variants underlying Mendelian disorders (Online Mendelian Inheritance in Man and Human Gene Mutation Database [HGMD])[1] and low-penetrance variants contributing to complex disorders (National Human Genome Research Institute [NHGRI] Catalog of Published Genome-wide Association Studies) attest to the progress made to date. Healthy individuals can, for a number of reasons, carry many disadvantageous variants without showing any obvious ill effects: (1) they might carry a single disease allele for a severe high-penetrance recessive disorder that requires two alleles to manifest the disease phenotype, (2) the disorder might be late in onset or require additional genetic and/or environmental factors for expression (reduced penetrance), or (3) the clinical phenotype might be mild and classified as lying within the range of normal healthy variation. It is clearly important that such uncertain records be identified in order that genomic sequences can be reliably interpreted in a medical context, and this will be increasingly relevant as we enter a new era of personalized genomics.[5]