Abstract

OAHG is an integrated resource that aims to provide comprehensive functional annotation of human protein-coding genes (PCGs), miRNAs, and lncRNAs using multi-level ontologies: Gene Ontology (GO), Disease Ontology (DO), and Human Phenotype Ontology (HPO). Many previous studies have focused on inferring putative properties and biological functions of PCGs and non-coding RNA genes from different perspectives. Over the past several decades, a few databases have been designed to annotate the functions of PCGs, miRNAs, and lncRNAs separately. Some of the functional descriptions in these databases have been mapped to standardized terminologies, such as GO, which facilitates further analysis. Despite these developments, there is no comprehensive resource recording the functions of all three of these important gene types. The current version of OAHG, release 1.0 (Jun 2016), integrates three ontologies (GO, DO, and HPO), six gene functional databases, and two interaction databases. Currently, OAHG contains 1,434,694 entries involving 16,929 PCGs, 637 miRNAs, 193 lncRNAs, and 24,894 ontology terms. In the performance evaluation, OAHG is consistent with existing gene interactions and with the structure of the ontologies. For example, terms with more similar structure tend to share more associated genes (Pearson correlation γ² = 0.2428, p < 2.2e–16).

Highlights

  • OAHG is an integrated resource that aims to provide comprehensive functional annotation of human protein-coding genes (PCGs), microRNA genes (miRNAs), and long non-coding RNA genes (lncRNAs) using multi-level ontologies: Gene Ontology (GO), Disease Ontology (DO), and Human Phenotype Ontology (HPO)

  • Accumulating evidence indicates that a growing number of these types of human genes exist, especially microRNA genes and long non-coding RNA genes[14,15], which raises the urgency of identifying the functions of miRNAs and lncRNAs[16,17]

  • LncRNAs are associated with the fewest terms (728), 715 of which are also associated with miRNAs and protein-coding genes (PCGs)

Summary

Introduction

OAHG is an integrated resource that aims to provide comprehensive functional annotation of human protein-coding genes (PCGs), miRNAs, and lncRNAs using multi-level ontologies: Gene Ontology (GO), Disease Ontology (DO), and Human Phenotype Ontology (HPO). Gene-term associations come from OAHG itself, interactions between genes come from the Human Protein Reference Database (HPRD)[28] and starBase v2.0[29], and the similarity of genes is calculated with the Jaccard index. To further verify the advantage of the integrated resource, the gene-term associations based on each of HPO, DO, and GO individually were also evaluated by how consistently gene similarity computed from associated terms agrees with gene similarity computed from interacting genes.
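
The consistency check described above can be illustrated with a minimal sketch. It assumes that each gene is represented by two sets (its associated ontology terms and its interaction partners), that the Jaccard index is computed for every gene pair on both representations, and that agreement between the two similarity lists is summarized with a Pearson correlation. The gene names, term identifiers, and partner identifiers are hypothetical toy data, and the choice of Pearson correlation for this step is an assumption, not a detail confirmed by the text.

```python
from itertools import combinations
from math import sqrt


def jaccard(a: set, b: set) -> float:
    """Jaccard index |a & b| / |a | b|; defined here as 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(xs: list, ys: list) -> float:
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: gene -> associated ontology terms.
gene_terms = {
    "GENE_A": {"T1", "T2", "T3"},
    "GENE_B": {"T2", "T3", "T4"},
    "GENE_C": {"T5"},
}

# Hypothetical toy data: gene -> interaction partners.
gene_partners = {
    "GENE_A": {"P1", "P2"},
    "GENE_B": {"P2", "P3"},
    "GENE_C": {"P4"},
}

# Compare every unordered gene pair under both representations.
pairs = list(combinations(sorted(gene_terms), 2))
term_sim = [jaccard(gene_terms[g], gene_terms[h]) for g, h in pairs]
partner_sim = [jaccard(gene_partners[g], gene_partners[h]) for g, h in pairs]

# Higher correlation means term-based and interaction-based similarity agree more.
consistency = pearson(term_sim, partner_sim)
print(pairs)
print(term_sim, partner_sim, consistency)
```

Under these assumptions, repeating the calculation with gene-term associations restricted to a single ontology (GO, DO, or HPO) and comparing the resulting correlations against the integrated set would mirror the comparison the text describes.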

