Abstract

The number of national reference populations that have been whole-genome sequenced is increasing rapidly. This development is driven in part by the fact that genetic disease studies benefit from knowing the genetic variation typical of the geographical area of interest. A whole-genome sequenced Swedish national reference population (n = 1000) was recently published, but it includes few samples from northern Sweden. In the present study, we whole-genome sequenced a control population (ACpop, n = 300) from Västerbotten County, a sparsely populated region in northern Sweden previously shown to be genetically different from southern Sweden. The aggregated variant frequencies within ACpop are publicly available (DOI 10.17044/NBIS/G000005) to serve as a basic resource in clinical genetics and for genetic studies. Our analysis of ACpop, which represents approximately 0.11% of the population of Västerbotten, indicates the presence of genetic substructure within the county. Furthermore, a demographic analysis showed that the population from which the samples were drawn was to a large extent geographically stationary, a finding that was corroborated by the genetic analysis down to the level of municipalities. Including ACpop in the reference population when imputing unknown variants in a Västerbotten cohort resulted in a strong increase in the number of high-confidence imputed variants (up to 81% for variants with minor allele frequency < 5%). ACpop was initially designed for cancer disease studies, but the genetic structure within the cohort will be of general interest for all genetic disease studies in northern Sweden.
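The imputation result above is a relative increase in the count of high-confidence imputed variants within a minor allele frequency (MAF) bin. The following is a minimal sketch of how such a percentage could be computed. It is not the study's pipeline: the `ImputedVariant` record, the `info` quality score, the 0.8 confidence threshold, and the toy values are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class ImputedVariant:
    maf: float   # minor allele frequency in the target cohort
    info: float  # imputation quality score in [0, 1] (hypothetical field)


def count_high_confidence(variants, maf_max=0.05, info_min=0.8):
    """Count variants with MAF below maf_max and quality of at least info_min.

    Both thresholds are assumptions made for illustration.
    """
    return sum(1 for v in variants if v.maf < maf_max and v.info >= info_min)


def relative_increase(baseline, augmented):
    """Percentage increase from the baseline count to the augmented count."""
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return 100.0 * (augmented - baseline) / baseline


# Toy data only: the same four variants imputed with two reference panels.
without_acpop = [
    ImputedVariant(0.01, 0.85),
    ImputedVariant(0.03, 0.60),
    ImputedVariant(0.04, 0.90),
    ImputedVariant(0.20, 0.95),  # common variant, outside the MAF < 5% bin
]
with_acpop = [
    ImputedVariant(0.01, 0.90),
    ImputedVariant(0.03, 0.82),
    ImputedVariant(0.04, 0.93),
    ImputedVariant(0.20, 0.97),
]

n_base = count_high_confidence(without_acpop)
n_aug = count_high_confidence(with_acpop)
print(n_base, n_aug, relative_increase(n_base, n_aug))
```

In this toy example, two rare variants pass the threshold with the baseline panel and three pass with the augmented panel, a 50% relative increase.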

Highlights

  • The challenge for all studies on genetic diseases is to disentangle disease-causing genetic variation

  • To maximize the diversity among selected individuals and to minimize selection bias, 27 phenotypic, health, and lifestyle-related variables were extracted from the Västerbotten Intervention Programme (VIP) [24, 25], and a principal component (PC) model was used to select the individuals to be sequenced from each municipality

  • We have whole-genome sequenced 300 individuals intended to be used as a control population in genetic disease studies in the Swedish county of Västerbotten and in northern Sweden
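The PC-based selection mentioned in the highlights can be illustrated with a small sketch. The source does not describe the exact selection rule, so everything here is an assumption made for illustration: only the first principal component is used, it is estimated by power iteration, and individuals are picked at evenly spaced ranks along that component within each municipality. The function names, the three toy variables (the study used 27), and the municipality labels are hypothetical.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit sample variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]


def first_pc_scores(z, iters=200):
    """Project standardized rows onto the first principal component."""
    n, p = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):  # power iteration on the covariance matrix
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return [sum(a * b for a, b in zip(r, v)) for r in z]


def select_spread(scores, k):
    """Pick k row indices at evenly spaced ranks along the score axis."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    if k >= len(order):
        return order
    if k == 1:
        return [order[len(order) // 2]]
    step = (len(order) - 1) / (k - 1)
    return [order[round(i * step)] for i in range(k)]


# Toy cohort: random values standing in for phenotypic and lifestyle variables.
random.seed(0)
cohort = {
    "municipality_A": [[random.gauss(0, 1) for _ in range(3)] for _ in range(20)],
    "municipality_B": [[random.gauss(0, 1) for _ in range(3)] for _ in range(15)],
}
selected = {}
for name, rows in cohort.items():
    scores = first_pc_scores(standardize(rows))
    selected[name] = select_spread(scores, k=5)
print(selected)
```

Selecting at evenly spaced ranks always includes the two extremes of the component, which is one simple way to spread the chosen individuals across the observed variation. A model using several components would need a distance-based rule instead.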


Introduction

The challenge for all studies on genetic diseases is to disentangle disease-causing genetic variation. Subregional genetic differences in northern Sweden [...] application is reviewed by the Biobank expert committee, and through amendment of the ethical permission and relevant data sharing agreements, the data can be shared. Aggregated data (variant frequencies) can be freely accessed through the National Bioinformatics Infrastructure Sweden site (https://swefreq.nbis.se/dataset/ACpop), DOI: 10.17044/NBIS/G000005.
