Abstract
Background
Longitudinal phenotypic data provides a rich potential resource for genetic studies, which may allow for greater understanding of variants and their covariates over time. Herein, we review 3 longitudinal analytical approaches from the Genetic Analysis Workshop 19 (GAW19). These contributions investigated both genome-wide association (GWA) and whole genome sequence (WGS) data from odd-numbered chromosomes on up to 4 time points for blood pressure–related phenotypes. The statistical models used included generalized estimating equations (GEEs), latent class growth modeling (LCGM), linear mixed-effect (LME), and variance components (VC). The goal of these analyses was to test statistical approaches that use repeat measurements to increase genetic signal for variant identification.
Results
Two analytical methods were applied to the GAW19 GWA data using real phenotypic data, and one approach was applied to the WGS data using 200 simulated replicates. The first GWA approach applied a GEE-based model to identify gene-based associations with 4 derived hypertension phenotypes. This GEE model identified 1 significant locus, GRM7, which passed multiple-testing correction for 2 hypertension-derived traits. The second GWA approach employed an LME model to estimate genetic associations with systolic blood pressure (SBP) change trajectories identified using LCGM. This LCGM method identified 5 SBP trajectories, and association analyses identified a genome-wide significant locus near ATOX1 (p = 1.0E−8). Finally, a third, VC-based approach used WGS data and simulated SBP phenotypes to fit a model that constrained the β coefficient for a genetic variant across time points, and compared it to an unconstrained approach. This constrained VC approach demonstrated increased power for WGS variants of moderate effect, but when larger genetic effects were present, averaging across time points was as effective.
Conclusion
In this paper, we summarize 3 GAW19 contributions applying novel statistical methods and testing previously proposed techniques under alternative conditions for longitudinal genetic association. We conclude that these approaches, when appropriately applied, have the potential to: (a) increase statistical power; (b) decrease trait heterogeneity and standard error; (c) decrease computational burden in WGS; and (d) identify genetic variants influencing subphenotypes important for understanding disease progression.
Highlights
Longitudinal phenotypic data provides a rich potential resource for genetic studies which may allow for greater understanding of variants and their covariates over time
The computational routines utilized need to be properly evaluated, as advanced statistical methods such as generalized estimating equations (GEEs) and linear mixed-effect (LME) models that account for the addition of pedigree structure may not be scalable to large genetic data sets
We summarize 3 Genetic Analysis Workshop 19 (GAW19) contributions (Table 1) that are focused on the development of statistical methods using repeat measurements and either genome-wide association (GWA) [10, 11] or whole genome sequence (WGS) [12] data
Summary
Longitudinal phenotypic data provides a rich potential resource for genetic studies, which may allow for greater understanding of variants and their covariates over time. The statistical models used included generalized estimating equations (GEEs), latent class growth modeling (LCGM), linear mixed-effect (LME), and variance components (VC). The goal of these analyses was to test statistical approaches that use repeat measurements to increase genetic signal for variant identification. Analysis of longitudinal measurements in genetic epidemiology provides a methodological strategy for understanding changes affecting long-term averages and changes in complex disease phenotypes over time. The design of these longitudinal studies may provide additional phenotypic information regarding age of onset. The computational routines utilized need to be properly evaluated, as advanced statistical methods such as GEEs and LME models that account for the addition of pedigree structure may not be scalable to large genetic data sets.
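To illustrate why repeat measurements can increase genetic signal, the following is a minimal sketch and not any of the GAW19 contributions' actual methods. It simulates unrelated individuals (no pedigree structure), an additive genotype effect that is the same at every visit, a person-specific random intercept, and independent visit-level noise. All names and parameter values (`simulate`, `ols_slope`, `beta=0.2`, `maf=0.3`, 4 visits) are hypothetical choices made for illustration. It compares a single-visit regression with a regression on the average across visits, and with a pooled regression that uses one shared slope for all visits.

```python
import math
import random


def ols_slope(x, y):
    """Simple linear regression of y on x; returns (slope, naive standard error)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    rss = sum(((yi - mean_y) - slope * (xi - mean_x)) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def simulate(n=2000, n_times=4, beta=0.2, maf=0.3, seed=0):
    """Hypothetical data: genotype in {0, 1, 2}, same effect at every visit."""
    rng = random.Random(seed)
    genotypes, phenotypes = [], []
    for _ in range(n):
        g = sum(rng.random() < maf for _ in range(2))  # two alleles
        u = rng.gauss(0.0, 1.0)  # person-specific random intercept
        row = [beta * g + u + rng.gauss(0.0, 1.0) for _ in range(n_times)]
        genotypes.append(g)
        phenotypes.append(row)
    return genotypes, phenotypes


genotypes, phenotypes = simulate()
n_times = len(phenotypes[0])

first_visit = [row[0] for row in phenotypes]
visit_mean = [sum(row) / n_times for row in phenotypes]

# Long format: one record per person-visit, with a single shared slope.
long_g = [g for g in genotypes for _ in range(n_times)]
long_y = [y for row in phenotypes for y in row]

beta_single, se_single = ols_slope(genotypes, first_visit)
beta_avg, se_avg = ols_slope(genotypes, visit_mean)
# The naive pooled standard error ignores within-person correlation,
# so it is discarded here; only the point estimate is compared.
beta_pooled, _ = ols_slope(long_g, long_y)

print(f"single visit : beta={beta_single:.3f}, se={se_single:.3f}")
print(f"visit average: beta={beta_avg:.3f}, se={se_avg:.3f}")
print(f"pooled slope : beta={beta_pooled:.3f}")
```

In this balanced, simplified setting the pooled shared-slope estimate coincides with the estimate from the averaged phenotype, and averaging reduces the standard error relative to a single visit because visit-level noise is averaged out. Real analyses of family data would additionally need to model relatedness and within-person correlation, for example with the GEE, LME, or VC models named above.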