Abstract

Advances in biomedical technology and research have produced a large number of research findings, most of which are published as unstructured text such as journal articles. Text mining techniques have therefore been employed to extract knowledge from such data. In this article we focus on the task of identifying and extracting relations between bio-entities, such as the relation between green tea and breast cancer. Unlike previous work that employs heuristics such as co-occurrence patterns and handcrafted syntactic rules, we propose a verb-centric algorithm. This algorithm identifies and extracts the main verb(s) in a sentence and therefore does not require predefined rules or patterns. Using the main verb(s), it then extracts the two entities involved in a relationship. The biomedical entities are identified using a dependency parse tree by applying syntactic and linguistic features such as prepositional phrases and semantic role analysis. The proposed verb-centric approach can effectively handle complex sentence structures such as clauses and conjunctive sentences. We evaluate the algorithm on several data sets and achieve an average F-score of 0.905, which is significantly higher than that of previous work.
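
To make the verb-centric idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the sentence has already been parsed into a dependency tree, represented here by a hypothetical `Token` record with spaCy-style labels (`ROOT`, `nsubj`, `dobj`, `prep`, `pobj`, `conj`). It finds the main verb(s), then collects subject and object entities, following conjunctions and prepositional phrases. Semantic role analysis, which the abstract also mentions, is not covered.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Token:
    text: str
    pos: str   # coarse part-of-speech tag, e.g. "VERB", "NOUN"
    dep: str   # dependency label (assumed spaCy-style naming)
    head: int  # index of the head token; the root points to itself


def children(tokens: List[Token], i: int, deps: Set[str]) -> List[int]:
    """Indices of tokens whose head is i and whose label is in deps."""
    return [j for j, t in enumerate(tokens)
            if t.head == i and j != i and t.dep in deps]


def phrase(tokens: List[Token], i: int) -> str:
    """Head noun plus its compound/adjectival modifiers, in sentence order."""
    idx = sorted([i] + children(tokens, i, {"compound", "amod"}))
    return " ".join(tokens[j].text for j in idx)


def with_conjuncts(tokens: List[Token], i: int) -> List[int]:
    """Token i followed by everything coordinated with it ("A and B")."""
    out = [i]
    for j in children(tokens, i, {"conj"}):
        out.extend(with_conjuncts(tokens, j))
    return out


def subjects_of(tokens: List[Token], v: int) -> List[int]:
    """Subjects of verb v; a conjoined verb inherits them from its head verb."""
    subs = children(tokens, v, {"nsubj", "nsubjpass"})
    while not subs and tokens[v].dep == "conj":
        v = tokens[v].head
        subs = children(tokens, v, {"nsubj", "nsubjpass"})
    return subs


def extract_relations(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Return (entity, verb, entity) triples anchored on the main verb(s)."""
    roots = [i for i, t in enumerate(tokens)
             if t.dep == "ROOT" and t.pos == "VERB"]
    verbs = [v for r in roots for v in with_conjuncts(tokens, r)]
    relations = []
    for v in verbs:
        # Objects: direct objects plus objects of prepositional phrases.
        objs = children(tokens, v, {"dobj"})
        for p in children(tokens, v, {"prep"}):
            objs += children(tokens, p, {"pobj"})
        for s in subjects_of(tokens, v):
            for s2 in with_conjuncts(tokens, s):
                for o in objs:
                    for o2 in with_conjuncts(tokens, o):
                        relations.append((phrase(tokens, s2),
                                          tokens[v].text,
                                          phrase(tokens, o2)))
    return relations


# Hand-built parse of "Green tea inhibits breast cancer and prostate cancer".
sentence = [
    Token("Green", "ADJ", "amod", 1),
    Token("tea", "NOUN", "nsubj", 2),
    Token("inhibits", "VERB", "ROOT", 2),
    Token("breast", "NOUN", "compound", 4),
    Token("cancer", "NOUN", "dobj", 2),
    Token("and", "CCONJ", "cc", 4),
    Token("prostate", "NOUN", "compound", 7),
    Token("cancer", "NOUN", "conj", 4),
]
relations = extract_relations(sentence)
for r in relations:
    print(r)
```

Because extraction starts from the verb instead of from a pattern catalogue, the same traversal handles a coordinated object (as in the example) or a coordinated verb without additional rules. In practice the hand-built parse would be replaced by the output of a dependency parser.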

