Abstract

Exploratory data-driven multivariate analysis provides a means of investigating underlying structure in complex data. To explore the stability of multivariate data modeling, we have applied a common method of multivariate modeling (factor analysis) to the Genetic Analysis Workshop 13 (GAW13) Framingham Heart Study data. Given the longitudinal nature of the data, multivariate models were generated independently for a number of different time points (corresponding to cross-sectional clinic visits for the two cohorts), and compared. In addition, each multivariate model was used to generate factor scores, which were then used as a quantitative trait in variance component-based linkage analysis to investigate the stability of linkage signals over time. We found surprisingly good correlation between factor models (i.e., predicted factor structures), maximum LOD scores, and locations of maximum LOD scores (0.81 < ρ < 0.94 for factor scores; ρ > 0.99 for peak locations; and 0.67 < ρ < 0.93 for peak LOD scores). Furthermore, the regions implicated by linkage analysis with these factor scores have also been observed in other studies, further validating our exploratory modeling.

Highlights

  • When examining large amounts of data with many correlated variables, a common approach is to employ dimensionality-reducing techniques, such as clustering methods, principal component analysis, or common factor analysis

  • These studies conclude that recognition of pleiotropic effects on multiple measured traits and application of appropriate multivariate analysis methods result in increased power to detect a genetic effect compared with univariate analysis of the individual measured traits

  • Residuals from each regression were used as inputs to factor analysis, generating four sets of factor structures


Summary

Introduction

When examining large amounts of data with many correlated variables, a common approach is to employ dimensionality-reducing techniques, such as clustering methods, principal component analysis, or common factor analysis. Each of these methods attempts to reduce the complexity in large systems by creating combinations of variables that reflect underlying, unobservable structures inherent in the data. Applying these methods to understand the structure of complex data falls into the realm of exploratory data analysis [1], and is commonly followed by a confirmatory phase, in which the reproducibility of the results is investigated. Multivariable modeling of this type has been used to identify genetic latent factors in studies of twins and family data [2]. These studies conclude that recognition of pleiotropic effects on multiple measured traits and application of appropriate multivariate analysis methods result in increased power to detect a genetic effect compared with univariate (simple) analysis of the individual measured traits.
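As a rough illustration of how a dimensionality-reducing technique combines correlated variables into one that reflects an unobserved structure, the sketch below extracts the first principal component of a correlation matrix by power iteration. It is not the factor analysis applied to the GAW13 data: the simulated traits, the loadings (0.9, 0.8, 0.7), the sample size, and all function names are hypothetical and chosen only for this example.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows`."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
        for j in range(p)
    ]
    corr = [[0.0] * p for _ in range(p)]
    for a in range(p):
        for b in range(p):
            cov = sum(
                (r[a] - means[a]) * (r[b] - means[b]) for r in rows
            ) / (n - 1)
            corr[a][b] = cov / (sds[a] * sds[b])
    return corr


def first_component(corr, iterations=200):
    """Leading eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the unit-length vector gives the eigenvalue.
    eigenvalue = sum(
        vec[i] * sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)
    )
    return vec, eigenvalue


def simulate(n=500, seed=0):
    """Hypothetical data: three traits share one latent factor, a fourth is noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        latent = rng.gauss(0, 1)
        rows.append([
            0.9 * latent + rng.gauss(0, 0.4),
            0.8 * latent + rng.gauss(0, 0.5),
            0.7 * latent + rng.gauss(0, 0.6),
            rng.gauss(0, 1),  # trait unrelated to the latent factor
        ])
    return rows


rows = simulate()
corr = correlation_matrix(rows)
weights, eigenvalue = first_component(corr)
# For a correlation matrix the eigenvalues sum to the number of variables.
share = eigenvalue / len(corr)
print("component weights:", [round(w, 2) for w in weights])
print("share of variance:", round(share, 2))
```

In this simulation the three traits driven by the shared latent variable receive large weights of the same sign, while the unrelated trait receives a weight near zero, so a single combined variable stands in for most of the correlated measurements. Common factor analysis differs in that it models only the shared variance and estimates a unique variance for each trait.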

