ukbREST: efficient and streamlined data access for reproducible research in large biobanks.

Milton Pividori, Hae Kyung Im

doi:10.1093/bioinformatics/bty925

Abstract

Summary: Large biobanks, such as UK Biobank with half a million participants, are changing the scale and availability of genotypic and phenotypic data for researchers to ask fundamental questions about the biology of health and disease. The breadth of the UK Biobank data is enabling discoveries at an unprecedented pace. However, this size and complexity pose new challenges to investigators who need to keep the accruing data up to date, comply with potential consent changes, and efficiently and reproducibly extract subsets of the data to answer specific scientific questions. Here we propose a tool called ukbREST designed for the UK Biobank study (easily extensible to other biobanks), which allows authorized users to efficiently retrieve phenotypic and genetic data. It exposes a REST API that makes data highly accessible inside a private and secure network, and it lets the data specification be written in a human-readable text format that is easily shared with other researchers. These characteristics make ukbREST an important tool for making valuable biobank data more readily accessible to the research community and for facilitating reproducibility of the analysis, a key aspect of science.

Availability and implementation: ukbREST is implemented in Python using the Flask-RESTful framework for the API and is released under the MIT license. It works with PostgreSQL, and a Docker image is available for easy deployment. The source code and documentation are available on GitHub: https://github.com/hakyimlab/ukbrest.

Highlights

Large-scale biobanks provide invaluable resources to the scientific community to investigate the causes of disease (Gaziano et al, 2016; Kvale et al, 2015; Sudlow et al, 2015).
UK Biobank, the most mature of them, is a prospective study of the health of individuals based in the UK (Bycroft et al, 2018).
In the UK Biobank, a data-field is identified with an ID followed by two additional indices: instance and array.
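
The ID, instance, and array triple described above can be illustrated with a small parser. This is only a sketch: it assumes a label written as `<id>-<instance>.<array>` (for example `50-0.1`), and the `c<id>_<instance>_<array>` column name it produces is a hypothetical naming scheme chosen for illustration, not something stated in the text.

```python
import re
from typing import NamedTuple


class DataField(NamedTuple):
    """A data-field reference: field ID plus instance and array indices."""
    field_id: int
    instance: int
    array: int


# Assumed label layout: "<field_id>-<instance>.<array>", e.g. "50-0.1".
_LABEL = re.compile(r"(\d+)-(\d+)\.(\d+)")


def parse_data_field(label: str) -> DataField:
    """Split a label such as "50-0.1" into its three integer parts."""
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"not a data-field label: {label!r}")
    field_id, instance, array = (int(part) for part in match.groups())
    return DataField(field_id, instance, array)


def to_column_name(field: DataField) -> str:
    """Build a SQL-friendly column name (hypothetical naming scheme)."""
    return f"c{field.field_id}_{field.instance}_{field.array}"


if __name__ == "__main__":
    field = parse_data_field("50-0.1")
    print(field, to_column_name(field))
```

Keeping the three parts as separate integers makes it straightforward to group repeated measurements (instances) or multi-valued answers (arrays) of the same field.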

Summary

Introduction

Large-scale biobanks provide invaluable resources to the scientific community to investigate the causes of disease (Gaziano et al, 2016; Kvale et al, 2015; Sudlow et al, 2015). Given the size and complexity of these data, maintenance and reproducible phenotype and covariate extraction can be challenging. To address these problems we developed ukbREST, a user-friendly tool that enables researchers to efficiently load the UK Biobank data into an SQL database, query any data-field, and reproducibly document the phenotypes derived. Once the ukbREST server is started, it is ready to receive queries from any authorized user through the REST API (with authentication and encryption capabilities).
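
To make the idea of querying data-fields over a REST API concrete, the sketch below only builds a request URL and performs no network access. The `/phenotype` path, the `columns` and `filters` parameter names, the base URL, and the `c50_0_0` column name are all assumptions made for illustration; the ukbREST documentation is the authority on the actual endpoints.

```python
from urllib.parse import urlencode


def build_phenotype_query(base_url, columns, filters=()):
    """Return a GET URL asking a REST server for the given columns.

    The endpoint path and the parameter names are hypothetical.
    Repeated parameters are used so each column or filter stays separate.
    """
    params = [("columns", column) for column in columns]
    params += [("filters", condition) for condition in filters]
    return f"{base_url.rstrip('/')}/phenotype?{urlencode(params)}"


if __name__ == "__main__":
    url = build_phenotype_query(
        "https://example.org/api/",          # placeholder server address
        ["c50_0_0 as my_variable"],           # hypothetical column name
        filters=["c50_0_0 > 150"],
    )
    print(url)
```

Because the whole request is a single URL, it can be pasted into a methods section or shared with a collaborator, which is one way a text-based query supports reproducibility.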

Reproducible phenotype specification

Security


Journal: Bioinformatics	Publication Date: Nov 5, 2018
Citations: 7	License type: CC BY 4.0


