Abstract

We investigated how knowledge of biological molecules, biological phenomena, and diseases is interconnected, with the aim of efficiently collecting information on the functions, roles, applications, and disease involvements of chemical compounds and gene products, using knowledge graphs (KGs) developed from Resource Description Framework (RDF) data and ontologies. The NikkajiRDF linked open data provide information on approximately 3.5 million chemical compounds and 694 application examples. We integrated NikkajiRDF with the Interlinking Ontology for Biological Concepts (IOBC), which contains approximately 80,000 concepts, including information on gene products, drugs, and diseases. Using IOBC’s ontological structure, we confirmed that this integration enabled us to infer new information regarding biological and chemical functions, applications, and involvements in diseases for 5038 chemical compounds. Furthermore, we developed KGs from IOBC and added to them the protein, biological phenomenon, and disease identifiers used in major biological databases: UniProt, Gene Ontology, and MeSH. Using the extended KGs and a federated search of DisGeNET, we discovered more than 60 chemicals and 700 gene products involved in 32 diseases.

Highlights

  • Information on the functions and physicochemical qualities of biological molecules, such as chemical compounds and gene products, is essential for elucidating and recognizing biological phenomena and for developing various bio-based products, for example, drugs, foods, and materials

  • We demonstrate the inference of chemical compounds and gene products involved in biological phenomena and diseases using knowledge graphs (KGs)

  • Future studies should validate these inferred results biologically and clinically. Using the Comparative Toxicogenomics Database (CTD) [62] and PubChem, we investigated whether the disease-related chemical compounds inferred in the Fibrinolysis network (Figs. 6 and 7) have been authorized as drugs for those diseases

Summary

Introduction

Information on the functions and physicochemical qualities of biological molecules, such as chemical compounds and gene products, is essential for elucidating and recognizing biological phenomena and for developing various bio-based products, for example, drugs, foods, and materials. The Interlinking Ontology for Biological Concepts (IOBC), previously referred to as the “Refined JST thesaurus” [11], contains approximately 80,000 biological concepts, including biological phenomena, diseases, molecular functions, gene products, chemical compounds, drugs, and medical procedures. It also contains approximately 20,000 related concepts in basic chemistry and environmental science [12]. IOBC covers various biological phenomena, including diseases, as well as chemical compounds, drugs, and gene products. However, these items lack the unique identifiers, such as InChI/InChIKey and protein IDs (e.g., UniProtKB accession numbers [15]), that allow easy mapping to biological molecules and drugs in other data resources. These data sources should therefore be combined to efficiently collect information on functions, roles, and applications.
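To illustrate why combining a compound resource with an ontology can yield new information, the following minimal sketch propagates disease links up a class hierarchy to a compound mapped onto one of the concepts. It is a toy illustration only, not the method used in this study: all identifiers with the `ex:` prefix, the predicates `ex:mapsTo` and `ex:involvedIn`, and the functions `build_index`, `ancestors`, and `infer_diseases` are hypothetical names introduced here, and the triples are invented placeholders rather than NikkajiRDF or IOBC data.

```python
from collections import defaultdict

# Toy triples (subject, predicate, object). Every ex: identifier below is a
# hypothetical placeholder, not a real NikkajiRDF or IOBC entry.
TRIPLES = [
    ("ex:compoundA", "ex:mapsTo", "ex:ConceptChild"),
    ("ex:ConceptChild", "rdfs:subClassOf", "ex:ConceptParent"),
    ("ex:ConceptParent", "rdfs:subClassOf", "ex:ConceptTop"),
    ("ex:ConceptParent", "ex:involvedIn", "ex:DiseaseX"),
    ("ex:compoundB", "ex:mapsTo", "ex:ConceptTop"),
]


def build_index(triples):
    """Index triples as subject -> predicate -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        index[subject][predicate].add(obj)
    return index


def ancestors(index, concept):
    """Return all superclasses reachable through rdfs:subClassOf."""
    seen = set()
    stack = [concept]
    while stack:
        node = stack.pop()
        for parent in index[node]["rdfs:subClassOf"]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_diseases(index, compound):
    """Collect diseases linked to the compound's concept or any superclass."""
    inferred = set()
    for concept in index[compound]["ex:mapsTo"]:
        for candidate in {concept} | ancestors(index, concept):
            inferred |= index[candidate]["ex:involvedIn"]
    return inferred


index = build_index(TRIPLES)
result_a = infer_diseases(index, "ex:compoundA")
result_b = infer_diseases(index, "ex:compoundB")
print(sorted(result_a))
print(sorted(result_b))
```

In this sketch, `ex:compoundA` has no direct disease link, but it inherits one through the superclass of its mapped concept, whereas `ex:compoundB` is mapped to a concept above the annotated class and inherits nothing. A real pipeline would run over RDF stores and SPARQL rather than in-memory tuples.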
