Abstract

Genomic data interpretation often requires analyses that move from a gene-by-gene focus to a focus on sets of genes that are associated with biological phenomena such as molecular processes, phenotypes, diseases, drug interactions or environmental conditions. Unique challenges exist in the curation of gene sets beyond the challenges in curation of individual genes. Here we highlight a literature curation workflow whereby gene sets are curated from peer-reviewed published data into GeneWeaver (GW), a data repository and analysis platform. We describe the system features that allow for a flexible yet precise curation procedure. We illustrate the value of curation by gene sets through analysis of independently curated sets that relate to the integrated stress response, showing that sets curated from independent sources all share significant Jaccard similarity. A suite of reproducible analysis tools is provided in GW as services to carry out interactive functional investigation of user-submitted gene sets within the context of over 150 000 gene sets constructed from publicly available resources and published gene lists. A curation interface supports the ability of users to design and maintain curation workflows of gene sets, including assigning, reviewing and releasing gene sets within a curation project context.

Highlights

  • Biocuration plays a central role in biomedical research and public data resources, such as the Gene Ontology (GO)

  • Empirical gene sets are supplemented by data automatically aggregated from public resources such as ontological annotations by term from GO, Disease Ontology and Mammalian Phenotype Ontology (MP); pathway databases [e.g. Kyoto Encyclopedia of Genes and Genomes (KEGG) and Pathway Commons (PC)]; and curated repositories (e.g. Comparative Toxicogenomics Database and Molecular Signatures Database)

  • Gene sets relevant to the execution of the integrated stress response (ISR) pathway were curated and compared using the Jaccard analysis tool. Gene sets named in this analysis include: Positive regulation of autophagy; Genes down-regulated in livers of Eif2ak4-mutant mice perfused with all amino acids minus methionine; Genes down-regulated in livers of Eif2ak3-mutant mice treated with tBuHQ; Genes up-regulated in amino acid-starved N25/2 cells; Atf4 target genes; Genes up-regulated in amino acid-starved fibroblasts
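
The Jaccard comparison of gene sets mentioned above can be illustrated with a minimal sketch. It computes the Jaccard similarity, the size of the intersection divided by the size of the union, for every pair of gene sets. The set names and gene memberships below are hypothetical placeholders chosen for illustration, not data from the paper, and the sketch does not reproduce the significance assessment that the GW tool reports.

```python
from itertools import combinations


def jaccard(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections of gene symbols."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty sets have no defined overlap; report 0.0 by convention.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical gene sets (illustrative memberships only, not curated data).
gene_sets = {
    "set_a": {"Atf4", "Ddit3", "Asns", "Trib3"},
    "set_b": {"Atf4", "Asns", "Trib3", "Sesn2", "Chac1"},
    "set_c": {"Ddit3", "Atf3"},
}

# Pairwise similarity for every unordered pair of sets.
pairwise = {
    (x, y): jaccard(gene_sets[x], gene_sets[y])
    for x, y in combinations(sorted(gene_sets), 2)
}

for (x, y), score in pairwise.items():
    print(f"{x} vs {y}: {score:.2f}")
```

A score of 1 means the two sets contain exactly the same genes and 0 means they share none. Gene identifiers would need to be normalized (for example across species or naming schemes) before such a comparison is meaningful.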

Introduction

Biocuration plays a central role in biomedical research and public data resources, such as the Gene Ontology (GO). GeneWeaver is designed to harmonize results from disparate experimental systems and functional genomics data including, but not limited to, differential expression profiling, genome-wide association studies (GWAS), gene networks and literature curation.
