Abstract

The development of analytical methods for Genome-wide Association Studies (GWAS) has outpaced the evolution of simulation techniques and pipelines. This disparity underscores the importance of innovative simulation methods that can keep pace with the rapidly increasing scale of GWAS. The median sample size of GWAS over the past ten years has exceeded 50,000 individuals, a trend that emphasizes the need for simulation tools capable of generating data on a similar or larger scale. This paper introduces a novel method, the small-group originating (SGO) model, which uses the SLiM software to simulate individual-level GWAS data. Our standardized protocol facilitates the generation of tens of thousands of pseudo-individuals with millions of variants from small (30–90) open-access datasets. SGO stands out, especially when compared with the widely used resampling method in HapGen, showing superior simulation efficiency for large sample sizes (> 13,000) of unrelated individuals. This capability is particularly relevant given the current trajectory towards larger GWAS, which necessitates tools that can simulate datasets reflective of this growth. Additionally, SGO provides customization options and can model dynamic life cycles and mating across generations, positioning it as a highly promising alternative for GWAS simulations. In a case study, sensitivity analyses of chromosome-level principal component analysis and kinship coefficient estimation were conducted. The results highlighted the poor robustness of chromosome-level quality control (QC) indexes and the uneven distribution of population structure across chromosomes and ancestries, and they argue for caution against relying solely on chromosome-level QC statistics. With its flexible and efficient approach to generating pseudo GWAS data, our standardized SGO protocol emerges as a crucial asset for method development, power analysis, and benchmarking in GWAS research. It is especially vital for meeting the demand for large-scale simulations that match the current and future scale of GWAS.


