Abstract

The exponential growth of genomic/genetic data in the era of Big Data demands new solutions for making these data findable, accessible, interoperable and reusable. In this article, we present a web-based platform named Gene Expression Time-Course Research (GETc) Platform that enables the discovery and visualization of time-course gene expression data and analytical results from the NIH/NCBI-sponsored Gene Expression Omnibus (GEO). The analytical results are produced from an analytic pipeline based on the ordinary differential equation model. Furthermore, in order to extract scientific insights from these results and disseminate the scientific findings, close and efficient collaborations between domain-specific experts from biomedical and scientific fields and data scientists is required. Therefore, GETc provides several recommendation functions and tools to facilitate effective collaborations. GETc platform is a very useful tool for researchers from the biomedical genomics community to present and communicate large numbers of analysis results from GEO. It is generalizable and broadly applicable across different biomedical research areas. GETc is a user-friendly and efficient web-based platform freely accessible at http://genestudy.org/

Highlights

  • Over the past few decades, substantial funding and resources have been invested to generate biomedical datasets at many levels, ranging from nucleic acid and gene level to population level, in order to understand, treat and prevent various diseases, and protect public health

  • Gene expression data from various diseases under different experimental conditions are mostly deposited in the National Institute of Health (NIH)/NCBI-sponsored Gene Expression Omnibus (GEO) data repository [1]

  • We developed a novel research platform called Gene Expression Time-Course Research (GETc) for sharing data and analytic results of timecourse gene expression datasets from GEO to improve the dataset reusability

Read more

Summary

Introduction

Over the past few decades, substantial funding and resources have been invested to generate biomedical datasets at many levels, ranging from nucleic acid and gene level to population level, in order to understand, treat and prevent various diseases, and protect public health. Like many of the biomedical databases, GEO was originally created as a data repository to comply with the data sharing policies. We present the three main parts of our work, (i) repositories developed for archiving datasets in the biomedical domains and their metadata, (ii) an analytic pipeline developed for analyzing gene data and (iii) dataset, literature and collaborator recommendation systems. It is a growing trend among the researchers to make their data publicly available for reproducibility and data reusability. Users from the biomedical community have to visit and search each repository separately to find data for their research, which can be time-consuming and hectic

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call