Abstract

Background: The Interaction Network Ontology (INO) logically represents biological interactions, pathways, and networks. INO has been demonstrated to be valuable in providing a set of structured ontological terms and associated keywords to support literature mining of gene-gene interactions from biomedical literature. However, previous work using INO focused on single keyword matching, while many interactions are represented with two or more interaction keywords used in combination.

Methods: This paper reports our extension of INO to include combinatory patterns of two or more literature mining keywords co-existing in one sentence to represent specific INO interaction classes. Such keyword combinations and related INO interaction type information could be automatically obtained via SPARQL queries, formatted in Excel format, and used in an INO-supported SciMiner, an in-house literature mining program. We studied the gene interaction sentences from the commonly used benchmark Learning Logic in Language (LLL) dataset and one internally generated vaccine-related dataset to identify and analyze interaction types containing multiple keywords. Patterns obtained from the dependency parse trees of the sentences were used to identify the interaction keywords that are related to each other and collectively represent an interaction type.

Results: The INO ontology currently has 575 terms, including 202 terms under the interaction branch. The relations between the INO interaction types and associated keywords are represented using the INO annotation relations ‘has literature mining keywords’ and ‘has keyword dependency pattern’. The keyword dependency patterns were generated by running the Stanford Parser to obtain dependency relation types. Out of the 107 interactions in the LLL dataset represented with two-keyword interaction types, 86 were identified by using the direct dependency relations. The LLL dataset contained 34 gene regulation interaction types, each of which is associated with multiple keywords. A hierarchical display of these 34 interaction types and their ancestor terms in INO resulted in the identification of specific gene-gene interaction patterns from the LLL dataset. Multi-keyword interaction types were also frequently observed in the vaccine dataset.

Conclusions: By modeling and representing multiple textual keywords for interaction types, the extended INO enabled the identification of complex biological gene-gene interactions represented with multiple keywords.

Electronic supplementary material: The online version of this article (doi:10.1186/s13040-016-0118-0) contains supplementary material, which is available to authorized users.

Highlights

  • The Interaction Network Ontology (INO) logically represents biological interactions, pathways, and networks

  • INO representation of complex interaction types: as defined previously, INO is aligned with the upper-level Basic Formal Ontology (BFO) [8]

  • Out of the 107 interactions in the Learning Logic in Language (LLL) dataset represented with two-keyword interaction types, 86 related keyword pairs were identified by using the direct dependency relations
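
The direct-dependency check mentioned in the last highlight can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name, the toy sentence ("geneA stimulates transcription of geneB"), and the hand-written dependency triples are all assumptions made for illustration, and no parser is actually run. The idea shown is only that two interaction keywords are treated as related when a single dependency edge links them.

```python
from itertools import combinations


def directly_related_keyword_pairs(keywords, dependencies):
    """Return (keyword_a, keyword_b, relation) for keyword pairs joined by one dependency edge.

    keywords:     set of interaction keywords found in the sentence
    dependencies: iterable of (relation, head, dependent) triples for that sentence
    """
    # Index every edge by its unordered pair of tokens so direction does not matter.
    edge_relation = {}
    for relation, head, dependent in dependencies:
        edge_relation[frozenset((head, dependent))] = relation

    pairs = []
    for a, b in combinations(sorted(keywords), 2):
        relation = edge_relation.get(frozenset((a, b)))
        if relation is not None:
            pairs.append((a, b, relation))
    return pairs


# Hypothetical, hand-written dependency triples for the toy sentence
# "geneA stimulates transcription of geneB" (illustrative, not parser output).
toy_dependencies = [
    ("nsubj", "stimulates", "geneA"),
    ("dobj", "stimulates", "transcription"),
    ("prep_of", "transcription", "geneB"),
]
toy_keywords = {"stimulates", "transcription"}

related = directly_related_keyword_pairs(toy_keywords, toy_dependencies)
print(related)
```

In this toy case the two keywords share one edge, so together they could be mapped to a single multi-keyword interaction type. Keyword pairs without a direct edge are simply not returned, which is consistent with only part of the two-keyword interactions being recoverable through direct dependency relations.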


Summary

Introduction

The Interaction Network Ontology (INO) logically represents biological interactions, pathways, and networks. We have shown that the use of ontologies, such as the Vaccine Ontology (VO), can enhance the mining of gene-gene interactions in a specific domain, for example, the vaccine domain [3, 4] or the vaccine-induced fever domain [5]. More than 800 interaction-associated keywords provide tags for mining interaction relations between two genes or proteins. The result of such mining is essentially binary: two entities are classified as either interacting or not interacting.
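
The binary outcome of single-keyword matching described above can be sketched in a few lines of Python. This is an illustrative simplification, not SciMiner's algorithm: the four-word keyword set, the function name, and the gene names are hypothetical stand-ins for the much larger keyword collection and for real named-entity recognition.

```python
import re

# Hypothetical toy sample; the real collection holds more than 800 keywords.
INTERACTION_KEYWORDS = {"activates", "inhibits", "binds", "regulates"}


def is_interacting(sentence, gene_a, gene_b, keywords=INTERACTION_KEYWORDS):
    """Binary call: True if both genes and at least one interaction keyword co-occur."""
    tokens = set(re.findall(r"[a-z0-9]+", sentence.lower()))
    both_genes_present = gene_a.lower() in tokens and gene_b.lower() in tokens
    keyword_present = bool(tokens & keywords)
    return both_genes_present and keyword_present


print(is_interacting("geneA activates geneB.", "geneA", "geneB"))
print(is_interacting("geneA and geneB were sequenced.", "geneA", "geneB"))
```

A check of this kind reports only whether an interaction is present. It does not say which interaction type is meant when several keywords appear together, which is the gap the multi-keyword patterns are meant to fill.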

