Abstract

Background

Determining the semantic relatedness of two biomedical terms is an important task for many text-mining applications in the biomedical field. Previous studies, such as those using ontology-based and corpus-based approaches, measured semantic relatedness by using information from the structure of biomedical literature, but these methods are limited by the small size of training resources. To increase the size of training datasets, the outputs of search engines have been used extensively to analyze the lexical patterns of biomedical terms.

Methodology/Principal Findings

In this work, we propose the Mutually Reinforcing Lexical Pattern Ranking (ReLPR) algorithm for learning and exploring the lexical patterns of synonym pairs in biomedical text. ReLPR employs lexical patterns and their pattern containers to assess the semantic relatedness of biomedical terms. By combining sentence structures and the linking activities between containers and lexical patterns, our algorithm can explore the correlation between two biomedical terms.

Conclusions/Significance

The average correlation coefficient of the ReLPR algorithm was 0.82 for various datasets. The results of the ReLPR algorithm were significantly superior to those of previous methods.

Highlights

  • Semantic relatedness has become increasingly important for the text-mining community in recent years, especially in the biomedical field

  • The Mutually Reinforcing Lexical Pattern Ranking (ReLPR) algorithm is designed to learn the lexical patterns of biomedical synonym pairs and to determine the semantic relatedness of concept pairs

  • Training a synonym lexical pattern database has four stages: (1) the synonym pairs are collected; (2) the snippets of synonym pairs are retrieved from search engines; (3) the lexical pattern voting scores are extracted by the ReLPR algorithm for each lexical pattern; and (4) a synonym lexical pattern database is constructed
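
The four stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the text above does not give the ReLPR update rule, so the scoring step below substitutes a generic HITS-style mutual reinforcement between containers and patterns. The names `extract_pattern` and `rank_patterns`, the `X`/`Y` placeholders, the `max_gap` limit, and the choice to treat each synonym pair as one pattern container are all assumptions made for illustration.

```python
import math
import re


def extract_pattern(snippet, term_a, term_b, max_gap=5):
    """Return the token sequence linking the two terms, or None.

    Terms are replaced by the placeholders X and Y (an assumed convention).
    Only the first occurrence of each term is considered.
    """
    text = snippet.lower()
    text = re.sub(r"\b" + re.escape(term_a.lower()) + r"\b", "X", text)
    text = re.sub(r"\b" + re.escape(term_b.lower()) + r"\b", "Y", text)
    tokens = re.findall(r"\w+|[^\w\s]", text)
    if "X" not in tokens or "Y" not in tokens:
        return None
    i, j = tokens.index("X"), tokens.index("Y")
    lo, hi = min(i, j), max(i, j)
    if hi - lo - 1 > max_gap:  # too many tokens between the terms
        return None
    return " ".join(tokens[lo:hi + 1])


def rank_patterns(container_patterns, iterations=50):
    """HITS-style scoring (an assumed stand-in for the ReLPR update rule).

    container_patterns maps a container id to the set of patterns it holds.
    A pattern scores high when it occurs in high-scoring containers, and a
    container scores high when it holds high-scoring patterns.
    """
    patterns = sorted({p for ps in container_patterns.values() for p in ps})
    c_score = {c: 1.0 for c in container_patterns}
    p_score = {p: 1.0 for p in patterns}
    for _ in range(iterations):
        # Patterns collect votes from the containers that hold them.
        new_p = {p: 0.0 for p in patterns}
        for c, ps in container_patterns.items():
            for p in ps:
                new_p[p] += c_score[c]
        norm = math.sqrt(sum(v * v for v in new_p.values())) or 1.0
        p_score = {p: v / norm for p, v in new_p.items()}
        # Containers are then re-scored from their patterns.
        new_c = {c: sum(p_score[p] for p in ps)
                 for c, ps in container_patterns.items()}
        norm = math.sqrt(sum(v * v for v in new_c.values())) or 1.0
        c_score = {c: v / norm for c, v in new_c.items()}
    return p_score


# Stage 1-2: a synonym pair and a hand-written stand-in for a search snippet.
pattern = extract_pattern(
    "Heart attack, also known as myocardial infarction, is serious.",
    "heart attack", "myocardial infarction")

# Stage 3-4: score patterns across containers; the result acts as the database.
database = rank_patterns({
    "pair_1": {"X also known as Y", "X or Y"},
    "pair_2": {"X also known as Y"},
    "pair_3": {"X also known as Y", "X and Y"},
})
```

In this toy input, the pattern shared by all three containers receives the highest score, which is the intended effect of mutual reinforcement: patterns that recur across many synonym pairs are treated as stronger evidence of synonymy than patterns seen only once.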


Summary

Introduction

Semantic relatedness has become increasingly important for the text-mining community in recent years, especially in the biomedical field. The existence of various relations has created additional challenges for understanding biomedical terms. Determining the semantic relatedness of two biomedical terms is an important task for many text-mining applications in the biomedical field. Previous studies, such as those using ontology-based and corpus-based approaches, measured semantic relatedness by using information from the structure of biomedical literature, but these methods are limited by the small size of training resources. To increase the size of training datasets, the outputs of search engines have been used extensively to analyze the lexical patterns of biomedical terms.
