An unbiased index to quantify participant’s phenotypic contribution to an open-access cohort

Yingleong Chan, Alexander W Zaranek, Sarah W Zaranek, Michael Tung, George M Church, Jeantine E Lunshof, Alexander S Garruss, Michael F Chou, Elaine T Lim, Ying Kai Chan, Madeleine P Ball

doi:10.1038/srep46148

Abstract

The Personal Genome Project (PGP) is an effort to enroll many participants to create an open-access repository of genome, health and trait data for research. However, PGP participants are not enrolled to study any specific trait, and participants choose which phenotypes to disclose. To measure the extent of phenotype reporting and participants’ willingness to report, and to encourage and guide participants to contribute phenotypes, we developed an algorithm to score and rank the phenotypes and participants of the PGP. The scoring algorithm calculates the participation index (P-index) for every participant, where 0 indicates no reported phenotypes and 100 indicates complete phenotype reporting. We calculated the P-index for all 5,015 participants in the PGP; the values ranged from 0 to 96.7. We found that participants mainly have either high scores (P-index > 90, 29.5%) or low scores (P-index < 10, 57.8%). Although there are significantly more male than female participants (1,793 versus 1,271), females tend to have higher P-indexes on average (P = 0.015). We also report P-indexes by demographics; states such as Missouri and Massachusetts have higher P-indexes than states such as Utah and Minnesota. The P-index can therefore be used as an unbiased way to measure and rank participants’ phenotypic contributions to the PGP.

Highlights

Using a scoring algorithm that is unbiased towards any particular phenotype, we explored the landscape of phenotypes available in the Personal Genome Project (PGP) and how extensively they are reported.
Phenotypes were considered valid if 2 or more participants reported valid values for them, if they do not pertain to genotyping information and if they meet our other filtering criteria.
We described a method for ranking phenotypes and participants in databases used for research and applied the method to the Harvard Personal Genome Project (PGP).
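
The validity rule in the second highlight can be illustrated with a minimal sketch. It is not the authors' implementation: the `reports` layout, the function name `valid_phenotypes`, the `excluded` set standing in for genotyping-related fields, and the treatment of empty values as invalid are all assumptions made for illustration. The paper's "other filtering criteria" are not described here and are not modeled.

```python
from collections import Counter

def valid_phenotypes(reports, min_participants=2, excluded=frozenset()):
    """Return phenotype names reported with a non-empty value by at least
    `min_participants` participants, skipping names in `excluded`.

    `reports` maps a participant ID to a dict of {phenotype name: value}.
    This layout is hypothetical, not the PGP's actual data format.
    """
    counts = Counter()
    for phenotypes in reports.values():
        for name, value in phenotypes.items():
            # Assumption: a missing or empty value does not count as valid.
            if value is not None and value != "":
                counts[name] += 1
    return {
        name
        for name, n in counts.items()
        if n >= min_participants and name not in excluded
    }

# Toy data, invented for this example.
reports = {
    "p1": {"height": "170 cm", "blood_type": "A", "genotyping_platform": "X"},
    "p2": {"height": "182 cm", "blood_type": ""},
    "p3": {"eye_color": "brown", "genotyping_platform": "Y"},
}
kept = valid_phenotypes(reports, excluded={"genotyping_platform"})
print(sorted(kept))  # ['height']
```

In this toy data only `height` survives: `blood_type` and `eye_color` each have one valid report, and `genotyping_platform` is excluded as genotyping information.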

Summary

Introduction

The data is made open-access, allowing anyone to use the genotype and phenotype data for research, accelerating the process of using data from large cohorts of individuals for research[16]. These participants have consented to the public sharing of their genotype and phenotype data for research purposes, and can be re-contacted for additional follow-up study. In determining the P-index, the algorithm allocates more weight to phenotypes that are provided by many participants and less weight to phenotypes that are provided by fewer participants. This is because having more participants with a specific phenotype increases the statistical power for discovering a meaningful genetic association[17,18,19]. We partitioned the participants demographically and reported their P-indexes based on states and zip codes. Using this scoring algorithm, we investigated the landscape of phenotype data available in the PGP, as well as the willingness of participants to provide phenotype data. Our algorithm can be used to incentivize and guide participants (see Discussion) to share more phenotype data, and can be applied to other projects structured like the PGP when reaching out to participants to share phenotypes.
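
The weighting idea described above can be sketched as follows. The exact formula is not given in this summary, so the sketch assumes the simplest scheme consistent with the description: each phenotype's weight equals the number of participants who reported it, and a participant's score is the weight they covered as a percentage of the total weight. The function name `p_indexes` and the `reports` layout are hypothetical, and the published P-index may be computed differently.

```python
def p_indexes(reports):
    """Score each participant from 0 (nothing reported) to 100 (everything reported).

    `reports` maps a participant ID to the set of phenotype names they reported.
    Assumed weighting: weight(phenotype) = number of participants reporting it.
    """
    weights = {}
    for phenotypes in reports.values():
        for name in phenotypes:
            weights[name] = weights.get(name, 0) + 1

    total = sum(weights.values())
    if total == 0:
        # Nobody reported anything, so every score is 0.
        return {pid: 0.0 for pid in reports}

    return {
        pid: 100.0 * sum(weights[name] for name in phenotypes) / total
        for pid, phenotypes in reports.items()
    }

# Toy cohort, invented for this example.
reports = {
    "p1": {"height", "weight", "eye_color"},
    "p2": {"height", "weight"},
    "p3": {"height"},
    "p4": set(),
}
scores = p_indexes(reports)
for pid in sorted(scores):
    print(pid, round(scores[pid], 1))
```

Here `height` (three reporters) counts three times as much as `eye_color` (one reporter), so `p3`, who reported only `height`, scores 50 rather than one third of the maximum.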



Journal: Scientific Reports	Publication Date: Apr 7, 2017
Citations: 5	License type: open-access


