FAIR compliant database development for human microbiome data samples.

Mathieu Dorst, Nathan Zeevenhooven, Daniel Mende, Rory Wilding, Egija Zaura, Alfons Hoekstra, Bernd W Brandt, Vivek M Sheraton

doi:10.3389/fcimb.2024.1384809

Open Access

Journal: Frontiers in Cellular and Infection Microbiology
Publication Date: May 7, 2024
Citations: 1
License type: CC BY 4.0

Abstract

Sharing microbiome data among researchers fosters innovation and reduces research costs. In practice, this means that the (meta)data must be standardized, transparent and readily available to researchers. The microbiome data and associated metadata must then be described with regard to composition and origin, in order to maximize their applicability in various research contexts. Here, we propose a set of tools and protocols to develop a real-time FAIR (Findable, Accessible, Interoperable and Reusable) compliant database for the handling and storage of human microbiome and host-associated data. The conflicts between FAIR implementations and privacy laws, with respect to metadata and possible human genome sequences in the metagenome shotgun data, are discussed. Alternate pathways for achieving compliance in such conflicts are analyzed. Sample-traceable and sensitive microbiome data, such as DNA sequences or geolocalized metadata, are identified, and the role of the General Data Protection Regulation (GDPR) is considered. For the construction of the database, procedures have been realized to make data FAIR compliant while preserving the privacy of the participants providing the data. An open-source development platform, Supabase, was used to implement the microbiome database. Researchers can deploy this real-time database to access, upload, download and interact with human microbiome data in a FAIR compliant manner. In addition, a large language model (LLM) powered by ChatGPT is developed and deployed to enable knowledge dissemination and non-expert usage of the database.

Full Text