Abstract

Background

The exponential growth of published biomedical literature has prompted the use of text mining tools to manage the information overload automatically. One of the most common applications is mining protein-protein interactions (PPIs) from PubMed abstracts. Currently, most tools for mining PPIs from the literature use co-occurrence-based or rule-based approaches. Hybrid (frame-based) methods that combine the two may predict PPIs more accurately. However, the PPIs predicted by these methods are rarely evaluated against known PPI databases or against shared terms in the Gene Ontology (GO) database.

Methodology/Principal Findings

We developed a web-based tool, PPI Finder, to mine human PPIs from PubMed abstracts based on their co-occurrences and interaction words, and then to look up supporting evidence in human PPI databases and shared terms in the GO database. Only 28% of the co-occurring pairs in PubMed abstracts appeared in any of the commonly used human PPI databases (HPRD, BioGRID and BIND). On the other hand, of the known PPIs in HPRD, 69% showed co-occurrences in the literature, and 65% shared GO terms.

Conclusions

PPI Finder provides a useful tool for biologists to uncover potential novel PPIs. It is freely accessible at http://liweilab.genetics.ac.cn/tm/.
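The database comparison reported in the abstract (the share of co-occurring pairs found in a PPI database, and the share of known PPIs that co-occur in the literature) amounts to a set-overlap calculation on unordered gene pairs. The sketch below illustrates that arithmetic only; the helper names `pair` and `overlap_fraction` and the toy gene pairs are hypothetical and are not taken from PPI Finder or from any real database.

```python
def pair(a, b):
    """Return an unordered gene pair so that (A, B) and (B, A) compare equal."""
    return frozenset((a, b))


def overlap_fraction(pairs, reference):
    """Fraction of `pairs` that also appear in `reference` (0.0 if `pairs` is empty)."""
    if not pairs:
        return 0.0
    return len(pairs & reference) / len(pairs)


# Toy, made-up data for illustration only (not results from the paper).
literature = {
    pair("TP53", "MDM2"),
    pair("TP53", "BRCA1"),
    pair("MDM2", "AKT1"),
    pair("EGFR", "GRB2"),
}
database = {
    pair("MDM2", "TP53"),
    pair("GRB2", "EGFR"),
    pair("BRCA1", "BARD1"),
}

# Share of literature pairs that are present in the database.
lit_in_db = overlap_fraction(literature, database)
# Share of database pairs that co-occur in the literature.
db_in_lit = overlap_fraction(database, literature)
```

The two fractions use different denominators, which is why the abstract can report a low value in one direction (28%) and a high value in the other (69%) for the same two collections.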

Highlights

  • With the overwhelming amount and exponential increase of biomedical literature, it is almost impossible for biologists to keep abreast of all the updated information in their research fields

  • PPI Finder provides a useful tool for biologists to uncover potential novel protein-protein interactions (PPIs)

  • We developed a novel frame-based algorithm for a web-based tool, PPI Finder, which finds genes related to a gene of interest based on their co-occurrence frequencies and extracts semantic descriptions of their interactions from the co-occurring literature using computational linguistic methods
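The combination of co-occurrence counting and interaction-word extraction described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PPI Finder implementation: the gene lexicon `GENES`, the word list `INTERACTION_WORDS`, the function name `related_genes`, abstract-level counting, and sentence-level evidence matching are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Hypothetical lexicons; a real tool would use curated gene-name and
# interaction-word dictionaries.
GENES = {"TP53", "MDM2", "BRCA1"}
INTERACTION_WORDS = {"binds", "interacts", "phosphorylates"}


def tokenize(text):
    """Split text into a set of alphanumeric tokens."""
    return set(re.findall(r"[A-Za-z0-9]+", text))


def related_genes(query, abstracts):
    """Count abstracts in which `query` co-occurs with each other gene, and
    collect sentences that mention both genes plus an interaction word."""
    counts = Counter()
    evidence = {}
    for text in abstracts:
        tokens = tokenize(text)
        if query not in tokens:
            continue
        # Abstract-level co-occurrence (assumed granularity).
        for gene in (GENES & tokens) - {query}:
            counts[gene] += 1
        # Sentence-level evidence: query, partner, and an interaction word.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            sent_tokens = tokenize(sentence)
            lowered = {t.lower() for t in sent_tokens}
            if query in sent_tokens and lowered & INTERACTION_WORDS:
                for gene in (GENES & sent_tokens) - {query}:
                    evidence.setdefault(gene, []).append(sentence)
    return counts, evidence


# Made-up example abstracts, for illustration only.
abstracts = [
    "MDM2 binds TP53 and promotes its degradation. BRCA1 was also studied.",
    "TP53 and MDM2 levels were measured. TP53 interacts with BRCA1 in vitro.",
]
counts, evidence = related_genes("TP53", abstracts)
```

Ranking partners by `counts` gives the co-occurrence view, while `evidence` keeps the sentences a reader would inspect to judge whether an interaction is actually described.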


