Abstract

Human geneticists are increasingly turning to study designs based on very large sample sizes to overcome difficulties in studying complex disorders. This in turn almost always requires multi-site data collection and processing of data through centralized repositories. While such repositories offer many advantages, including the ability to return to previously collected data to apply new analytic techniques, they also have some limitations. To illustrate, we reviewed data from seven older schizophrenia studies available from the NIMH-funded Center for Collaborative Genomic Studies on Mental Disorders, also known as the Human Genetics Initiative (HGI), and assessed the impact of data cleaning and regularization on linkage analyses. Extensive data regularization protocols were developed and applied to both genotypic and phenotypic data. Genome-wide nonparametric linkage (NPL) statistics were computed for each study at various stages of data processing. To assess the impact of data processing on aggregate results, Genome-Scan Meta-Analysis (GSMA) was performed. Examples of increased, reduced and shifted linkage peaks were found when comparing linkage results based on original HGI data to results using post-processed data within the same set of pedigrees. Interestingly, reducing the number of affected individuals tended to increase rather than decrease linkage peaks. Most importantly, while the effects of data regularization within individual data sets were small, GSMA applied to the data in aggregate yielded a substantially different picture after data regularization. These results have implications for analyses based on other types of data (e.g., case-control GWAS or sequencing data) as well as data obtained from other repositories.
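Because GSMA works on ranks rather than raw linkage scores, small within-study changes can reorder bins and so alter the aggregate picture. The following is a minimal sketch of the rank-summation idea only, not the pipeline used in this study: the 30 cM bin width, the function names, and the toy scan data are all illustrative assumptions, and study weighting and permutation-based significance testing are omitted.

```python
from collections import defaultdict


def bin_maxima(scan, bin_width_cm=30.0):
    """Map each (chromosome, bin index) to the maximum statistic seen in that bin.

    `scan` is an iterable of (chromosome, position_cM, statistic) tuples.
    """
    maxima = {}
    for chrom, pos, stat in scan:
        key = (chrom, int(pos // bin_width_cm))
        if key not in maxima or stat > maxima[key]:
            maxima[key] = stat
    return maxima


def rank_bins(maxima):
    """Rank bins within one study: weakest bin gets 1, strongest gets N.

    Tied bins share the average of the ranks they span.
    """
    ordered = sorted(maxima, key=lambda k: maxima[k])
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and maxima[ordered[j + 1]] == maxima[ordered[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for key in ordered[i:j + 1]:
            ranks[key] = average_rank
        i = j + 1
    return ranks


def summed_ranks(studies, bin_width_cm=30.0):
    """Sum within-study bin ranks across studies (unweighted)."""
    totals = defaultdict(float)
    for scan in studies:
        for key, rank in rank_bins(bin_maxima(scan, bin_width_cm)).items():
            totals[key] += rank
    return dict(totals)


# Toy data (hypothetical): two studies, each a list of
# (chromosome, position in cM, NPL-like statistic).
study_a = [("1", 10.0, 0.5), ("1", 25.0, 2.1), ("1", 40.0, 1.0), ("2", 5.0, 0.2)]
study_b = [("1", 12.0, 0.3), ("1", 45.0, 1.8), ("2", 8.0, 0.9)]
totals = summed_ranks([study_a, study_b])
```

In this toy example the bin with the highest summed rank is not the bin holding the single largest statistic, which illustrates how aggregate results can diverge from any one study's peaks.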

Highlights

  • For the past two decades, NIMH-funded investigators conducting genetic research have been strongly encouraged to contribute biospecimens, along with whatever corresponding genotypic and phenotypic information they have assembled, to a centralized repository housed at the Center for Collaborative Genomic Studies on Mental Disorders at Rutgers University and Washington University.

  • The Human Genetics Initiative (HGI) is an enormous resource for the psychiatric genetics community, insofar as it facilitates joint analysis of multiple studies, allowing analyses based on far larger sample sizes than can be accomplished by any one research project. (Note that the HGI has recently been renamed the NIMH Repository and Genomics Resource.)

  • We report here results from the Combined Analysis of Psychiatric Studies (CAPS), a collaborative project with the HGI, a primary aim of which is to review and regularize HGI data across multiple studies, returning to the community sets of data configured for cross-study analysis.


Summary

Introduction

For the past two decades, NIMH-funded investigators conducting genetic research have been strongly encouraged to contribute biospecimens, along with whatever corresponding genotypic and phenotypic information they have assembled, to a centralized repository housed at the Center for Collaborative Genomic Studies on Mental Disorders at Rutgers University and Washington University. The repository grows immortalized cell lines, supplies DNA to researchers, and provides downloadable copies of clinical and genotypic data files through the Human Genetics Initiative (HGI, nimhgenetics.org). Although researchers are likely to discover errors in their own data before publication, there is no formal mechanism to ensure that corresponding corrections are made to repository files after the initial data deposit. It is frequently not possible to fully reconcile even simple counts (such as the number of cases or families) between HGI files and the published reports describing the same data sets.
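The kind of reconciliation check described here can be illustrated with a short sketch. It is an assumption-laden example, not the CAPS protocol: the record fields (`family_id`, `person_id`, `affected`), the function names, and all counts are hypothetical and do not correspond to any real HGI file or publication.

```python
def summarize(records):
    """Count distinct families and affected individuals in a pedigree table."""
    families = {r["family_id"] for r in records}
    affected = sum(1 for r in records if r["affected"])
    return {"families": len(families), "affected": affected}


def discrepancies(repo_counts, published_counts):
    """Return {quantity: (repository value, published value)} wherever they differ."""
    return {
        key: (repo_counts.get(key), published)
        for key, published in published_counts.items()
        if repo_counts.get(key) != published
    }


# Hypothetical repository records and hypothetical published counts.
records = [
    {"family_id": "F1", "person_id": "1", "affected": True},
    {"family_id": "F1", "person_id": "2", "affected": False},
    {"family_id": "F2", "person_id": "1", "affected": True},
]
published = {"families": 2, "affected": 3}

repo = summarize(records)
mismatches = discrepancies(repo, published)
```

A non-empty result flags quantities that need manual review; it does not by itself indicate whether the repository file or the published report is in error.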

