Abstract

Knowledge of the genetic variability of the local population is of utmost importance in personalized medicine and has proven to be a critical factor in the discovery of new disease variants. Here, we present the Collaborative Spanish Variability Server (CSVS), which currently contains more than 2000 genomes and exomes of unrelated Spanish individuals. The database was generated through a collaborative crowdsourcing effort that collects sequencing data produced by local genomic projects and for other purposes. Sequences are grouped by ICD10 upper categories. A web interface allows the database to be queried with one or more ICD10 categories removed. In this way, aggregated allele counts and frequencies of a pseudo-control Spanish population can be obtained for diseases belonging to the removed category. In addition to pseudo-control studies, some population studies are possible, such as estimating the prevalence of pharmacogenomic variants. This genomic data has also been used to define the first Spanish Genome Reference Panel (SGRP1.0) for imputation. This is the first local repository of variability produced entirely by a crowdsourcing effort, and it constitutes an example for future initiatives to characterize local variability worldwide. CSVS is also part of the GA4GH Beacon network. CSVS can be accessed at: http://csvs.babelomics.org/.
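The pseudo-control query described above can be illustrated with a minimal sketch. It assumes diploid autosomal genotypes and per-category genotype counts for a single variant. The category labels, the counts, and the function name `pseudo_control_frequency` are hypothetical and are not taken from the CSVS implementation or its API.

```python
from collections import namedtuple

# Genotype counts for one variant within one disease category (hypothetical data model).
Counts = namedtuple("Counts", ["hom_ref", "het", "hom_alt"])


def pseudo_control_frequency(counts_by_category, excluded):
    """Alternate-allele frequency after removing the excluded categories.

    Assumes diploid autosomal genotypes: each individual contributes two alleles.
    Returns None when no individuals remain after exclusion.
    """
    kept = [c for cat, c in counts_by_category.items() if cat not in excluded]
    alt_alleles = sum(c.het + 2 * c.hom_alt for c in kept)
    total_alleles = sum(2 * (c.hom_ref + c.het + c.hom_alt) for c in kept)
    return alt_alleles / total_alleles if total_alleles else None


# Invented counts for illustration only; the labels are placeholders, not ICD10 codes.
counts = {
    "category_A": Counts(hom_ref=90, het=10, hom_alt=0),
    "category_B": Counts(hom_ref=40, het=8, hom_alt=2),
    "healthy": Counts(hom_ref=100, het=0, hom_alt=0),
}

# Studying a disease in category_B: remove that category so the remaining
# individuals act as pseudo-controls.
freq = pseudo_control_frequency(counts, excluded={"category_B"})
```

Removing the category under study keeps affected individuals from inflating the frequency of a candidate variant in the control set, which is the point of the pseudo-control design.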

Highlights

  • Sequencing technologies have experienced an unprecedented development during the last decade [1] that resulted in different international collaborative projects [2,3,4] which contributed to an extraordinary increase in the knowledge of the mutational spectrum of diseases

  • More than 4500 monogenic diseases can nowadays be directly diagnosed by personalized genomics [7], a possibility that might soon be extended to the whole spectrum of rare diseases with a genetic background [8]

  • Genomic data of healthy individuals belonging to the local population of interest are often scarce, if available at all


Summary

Introduction

Sequencing technologies have experienced an unprecedented development during the last decade [1], resulting in several international collaborative projects [2,3,4] that have contributed to an extraordinary increase in the knowledge of the mutational spectrum of diseases. This generation of knowledge has been especially significant in diseases with high morbidity and mortality caused by highly penetrant (typically protein-coding) variants [5,6]. The rationale for filtering candidate variants against a control population is as follows: variants that are relatively common in a control population (common variation) are likely benign [10], while rare variants (especially those with functional consequences) found in multiple affected cases but absent in the control population are likely to cause disease [11,12,13]. Such filters search for genes or variants present in all (or most) affected individuals but in none (or very few) of the unaffected control individuals. The availability of healthy controls is therefore a decisive factor in the discovery of new disease determinants.
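The filtering rationale above can be sketched in a few lines. This is an illustration only: the function name `candidate_variants`, the thresholds `max_control_freq` and `min_case_fraction`, and the variant identifiers are assumptions chosen for the example, not values from the paper.

```python
def candidate_variants(case_carriers, control_freq, n_cases,
                       max_control_freq=0.01, min_case_fraction=0.8):
    """Keep variants that are rare in controls and present in most affected cases.

    case_carriers: variant id -> number of affected individuals carrying it
    control_freq:  variant id -> allele frequency in the control population
                   (a variant missing from this mapping is treated as absent)
    Both thresholds are hypothetical defaults for illustration.
    """
    candidates = []
    for variant, carriers in case_carriers.items():
        freq = control_freq.get(variant, 0.0)
        rare_in_controls = freq <= max_control_freq
        shared_by_cases = carriers / n_cases >= min_case_fraction
        if rare_in_controls and shared_by_cases:
            candidates.append(variant)
    return sorted(candidates)


# Invented example: five affected individuals, three variants.
case_carriers = {"v1": 5, "v2": 5, "v3": 1}
control_freq = {"v1": 0.20, "v3": 0.0}  # v2 was never seen in controls

result = candidate_variants(case_carriers, control_freq, n_cases=5)
```

Here `v1` is discarded as common variation, `v3` is discarded because only one case carries it, and `v2` remains as a candidate. The quality of such a filter depends directly on how well the control frequencies represent the local population.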

