Abstract

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes)[1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

Highlights

  • We describe high-coverage whole-genome sequencing (WGS) analyses of the first 53,831 TOPMed samples (Box 1 and Extended Data Tables 1, 2); additional data are being made available as quality control, variant calling and dbGaP curation are completed

  • WGS of the TOPMed samples was performed over multiple studies, years and sequencing centres

Introduction

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. Advancing DNA-sequencing technologies and decreasing costs are enabling researchers to explore human genetic variation at an unprecedented scale[2,3]. For these advances to improve our understanding of human health, they must be deployed in well-phenotyped human samples and used to build resources such as variation catalogues[3,4], control collections[5,6] and imputation reference panels[7,8,9]. A key goal of the TOPMed programme is to understand risk factors for heart, lung, blood and sleep disorders by adding WGS and other ‘omics’ data to existing studies with deep phenotyping (Supplementary Information 1.1 and Supplementary Fig. 1). The 53,831 samples described here are drawn from TOPMed freeze 5
