Abstract

Background: Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. However, in biomedical literature mining, interaction keywords have not been carefully studied and used beyond a collection of keywords.

Methods: In this study, we report the development of a new Interaction Network Ontology (INO) that classifies >800 interaction keywords and incorporates interaction terms from the PSI Molecular Interactions (PSI-MI) and Gene Ontology (GO). Using INO-based literature mining results, a modified Fisher’s exact test was established to analyze significantly over- and under-represented enriched gene-gene interaction types within a specific area. Such a strategy was applied to study the vaccine-mediated gene-gene interactions using all PubMed abstracts. The Vaccine Ontology (VO) and INO were used to support the retrieval of vaccine terms and interaction keywords from the literature.

Results: INO is aligned with the Basic Formal Ontology (BFO) and imports terms from 10 other existing ontologies. Current INO includes 540 terms. In terms of interaction-related terms, INO imports and aligns PSI-MI and GO interaction terms and includes over 100 newly generated ontology terms with ‘INO_’ prefix. A new annotation property, ‘has literature mining keywords’, was generated to allow the listing of different keywords mapping to the interaction types in INO. Using all PubMed documents published as of 12/31/2013, approximately 266,000 vaccine-associated documents were identified, and a total of 6,116 gene-pairs were associated with at least one INO term. Out of 78 INO interaction terms associated with at least five gene-pairs of the vaccine-associated sub-network, 14 terms were significantly over-represented (i.e., more frequently used) and 17 under-represented based on our modified Fisher’s exact test. These over-represented and under-represented terms share some common top-level terms but are distinct at the bottom levels of the INO hierarchy. The analysis of these interaction types and their associated gene-gene pairs uncovered many scientific insights.

Conclusions: INO provides a novel approach for defining hierarchical interaction types and related keywords for literature mining. The ontology-based literature mining, in combination with an INO-based statistical interaction enrichment test, provides a new platform for efficient mining and analysis of topic-specific gene interaction networks.
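To make the enrichment idea concrete, the following is a minimal sketch of a standard one-sided Fisher's exact test on a 2x2 table of gene-pair counts. It is not the modified test described in the abstract, whose details are not given here. The function names, the table layout, and the assumption that the background row excludes the vaccine-associated sub-network are illustrative choices, and the counts in the usage lines are toy numbers, not results from the study.

```python
from math import comb


def hypergeom_pmf(k, total, group, draws):
    """P(X = k) when drawing `draws` items from `total`, `group` of which are marked."""
    return comb(group, k) * comb(total - group, draws - k) / comb(total, draws)


def fisher_one_sided(a, b, c, d):
    """Return (p_over, p_under) for the 2x2 table [[a, b], [c, d]].

    a: sub-network gene pairs annotated with the interaction type
    b: sub-network gene pairs without it
    c: background gene pairs annotated with the interaction type (assumed disjoint)
    d: background gene pairs without it
    """
    total = a + b + c + d
    with_term = a + c
    in_subnet = a + b
    # Range of values the top-left cell can take with the margins held fixed.
    lowest = max(0, in_subnet - (total - with_term))
    highest = min(with_term, in_subnet)
    # Over-representation: probability of observing a or more annotated pairs.
    p_over = sum(hypergeom_pmf(k, total, with_term, in_subnet)
                 for k in range(a, highest + 1))
    # Under-representation: probability of observing a or fewer annotated pairs.
    p_under = sum(hypergeom_pmf(k, total, with_term, in_subnet)
                  for k in range(lowest, a + 1))
    return p_over, p_under


# Toy counts only (hypothetical, not taken from the paper).
p_over, p_under = fisher_one_sided(3, 1, 1, 3)
print(f"over-represented p = {p_over:.4f}, under-represented p = {p_under:.4f}")
```

In a setting like the one described, such a test would be run once per interaction term, so a multiple-testing correction would normally be considered before calling a term significant.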

Highlights

  • Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications

  • INO is a biomedical ontology in the domain of molecular interactions and interaction networks

  • INO is aligned with the upper-level Basic Formal Ontology (BFO) [17] (Figure 1)


Summary

Introduction

Literature mining of gene-gene interactions has been enhanced by ontology-based name classifications. Two common strategies for retrieving reported gene-gene interactions from the literature are gene-gene co-occurrence and interaction keyword-based literature mining. A gene-gene interaction represents a broad interactive relation between two genes or gene products [1]. The co-occurrence strategy identifies two related genes that are both mentioned in the same document, or in the same title, abstract, or sentence. An example of such a strategy is PubGene, which extracts gene relationships based on the co-occurrence of gene symbols in MEDLINE titles and abstracts [2]. The other strategy relies on the identification of two genes together with an interaction keyword in the same sentence. To improve the interaction keyword-based approach, machine learning algorithms (e.g., support vector machine (SVM) [3]) with features extracted from syntactic analysis of sentences (e.g., dependency parse trees) can be used [4].
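The difference between the two retrieval strategies can be illustrated with a small sentence-level sketch. Everything in it is an assumption made for illustration: the gene dictionary, the interaction type labels and their keyword lists (which are not taken from INO), and the example sentences are invented, and plain token matching stands in for the gene name recognition a real pipeline would need. The machine learning refinement based on dependency parses is not shown.

```python
import re
from itertools import combinations

# Hypothetical toy dictionaries; real gene and keyword lists are far larger.
GENES = {"IFNG", "IL2", "TNF"}
KEYWORDS = {  # interaction type label -> literature mining keywords (illustrative)
    "regulation": {"regulates", "regulated", "regulation"},
    "induction": {"induces", "induced", "induction"},
}


def tokenize(sentence):
    """Split a sentence into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def cooccurring_pairs(sentence):
    """Co-occurrence strategy: every pair of known genes in the same sentence."""
    tokens = set(tokenize(sentence))
    genes = sorted(g for g in GENES if g in tokens)
    return list(combinations(genes, 2))


def keyword_pairs(sentence):
    """Keyword strategy: gene pairs that also share the sentence with a keyword."""
    lowered = {t.lower() for t in tokenize(sentence)}
    types = sorted(t for t, kws in KEYWORDS.items() if kws & lowered)
    return [(g1, g2, t) for g1, g2 in cooccurring_pairs(sentence) for t in types]


# Invented example sentences.
for s in ["IL2 induces IFNG expression in T cells.", "IL2 and TNF were measured."]:
    print(s, cooccurring_pairs(s), keyword_pairs(s))
```

The second sentence yields a co-occurring pair but no keyword-based hit, which is the practical distinction between the two strategies: co-occurrence favors recall, while requiring an interaction keyword narrows the results and attaches an interaction type to each pair.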

