Abstract

We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten “case” genomes from individuals with severe hemophilia A and ten “control” genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop-loss variants in genes representing a diverse set of pathways.
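The per-genome discovery count mentioned above can be illustrated with a minimal sketch. It assumes that "novel" means "not seen in any previously processed genome and not in an optional set of already known variants"; the abstract does not spell out the exact definition, so this is an illustration and not the authors' pipeline. The function name `novel_variants_per_genome`, the `known` parameter, and the toy variant tuples are all hypothetical.

```python
from typing import Hashable, Iterable, List, Optional, Set


def novel_variants_per_genome(
    genomes: Iterable[Set[Hashable]],
    known: Optional[Set[Hashable]] = None,
) -> List[int]:
    """Count, for each genome in order, the variants not seen before.

    `known` is an optional set of variants treated as already catalogued
    (an assumption for illustration, e.g. a public variant database).
    """
    seen: Set[Hashable] = set(known) if known else set()
    counts: List[int] = []
    for variants in genomes:
        new = variants - seen      # variants first observed in this genome
        counts.append(len(new))
        seen |= new                # later genomes no longer count these as novel
    return counts


# Toy data: each variant is keyed as (chromosome, position, ref, alt).
toy_genomes = [
    {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chrX", 50, "G", "A")},
    {("chr1", 100, "A", "G"), ("chr2", 300, "T", "C")},
    {("chr1", 200, "C", "T"), ("chr2", 300, "T", "C"), ("chr3", 10, "G", "C")},
]

print(novel_variants_per_genome(toy_genomes))  # [3, 1, 1]
```

Plotting these counts against the number of genomes processed gives a discovery curve. A plateau in that curve is the kind of stabilization the abstract describes, although the order in which genomes are added affects the individual counts.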

Highlights

  • The technology to sequence entire human genomes has evolved rapidly in recent years.

  • We report here the nearly complete genomic sequence of 20 different individuals, determined using “next-generation” sequencing technologies. We use these data to characterize the type of genetic variation carried by humans in a sample of this size, which is, to our knowledge, the largest set of unrelated genomic sequences reported.

  • This work provides important fundamental information about the scope of human genetic variation, and suggests ways to further explore the relationship between these genetic variants and human disease.



Introduction

Massively parallel sequencing techniques have been developed, and it is now possible to sequence an entire human genome in little more than a week. Programs to align these short reads and call the resulting variants are being developed and optimized [1,2], and the cost to sequence a genome has plummeted. Single human genomes have been sequenced on a number of different next-generation sequencing platforms [3,4,5,6]. As a first step in that direction, we have characterized the patterns of variation observed in 20 human genomes that were sequenced at high coverage using the Illumina Genome Analyzer IIx platform.

