Abstract

Background: Biomedical event extraction is at the frontier of biomedical research. Its two main subtasks are trigger identification and argument detection, both of which can be treated as classification problems. However, traditional state-of-the-art methods are based on support vector machines (SVM) with a large number of manually designed features in one-hot representation, which require enormous effort to build yet fail to capture semantic relations among words.

Methods: In this paper, we propose a multiple distributed representation method for biomedical event extraction. The method combines context, represented with dependency-based word embeddings, with task-based features represented in a distributed way, and uses the combination as the input for training deep learning models. Finally, we use a softmax classifier to label the example candidates.

Results: Experiments on the Multi-Level Event Extraction (MLEE) corpus yield F-scores of 77.97% on trigger identification and 58.31% overall, both higher than those of the state-of-the-art SVM method.

Conclusions: Our distributed representation method for biomedical event extraction avoids the semantic gap and the curse of dimensionality that come with traditional one-hot representations. The promising results demonstrate that the proposed method is effective for biomedical event extraction.

Highlights

  • Biomedical event extraction is at the frontier of biomedical research

  • Our main work focuses on applying deep learning models with distributed representations to trigger and argument detection in order to improve the accuracy of biomedical event extraction

  • We evaluated our proposed method on the Multi-Level Event Extraction (MLEE) corpus, which spans all levels of biomedical organization and covers 19 types of triggers

Summary

Methods

Biomedical event extraction aims to extract complex relations between biomedical entities. Our main work focuses on applying deep learning models with distributed representations to trigger and argument detection in order to improve the accuracy of biomedical event extraction.

Event trigger identification

The main problem of trigger identification is to predict the trigger type of every word in a sentence, which can be framed as a classification problem.

Based on our previous work [15], we adopt the distance between the trigger and entities in the dependency tree as a feature, represented as $\psi_{dis}(T_i E_j) = \langle W_{dis} \rangle [dis(T_i, E_j)]$. We employed a CNN to model the sentence based on the dependency path between the trigger and candidates in order to detect trigger-argument or trigger-trigger relations. We used the words on the dependency path between the trigger and a candidate to represent the logical relation between them. Each word $w_i$ in the sentence $S$ carries two distance features: one is the distance from $w_i$ to the first entity (or trigger), and the other is the distance from $w_i$ to the second entity (or trigger).
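The two kinds of distance feature described here can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the dependency-tree distance is the number of edges on the shortest path between two tokens, that $\langle W_{dis} \rangle [\cdot]$ denotes a row lookup in a learned embedding table, and that the per-word distances are signed token offsets. The function names, the random initialisation, the clipping of large distances, and the toy sentence are all hypothetical.

```python
from collections import deque
import random


def tree_distance(edges, source, target):
    """Count the edges on the shortest path between two tokens.

    `edges` is a list of (head_index, dependent_index) pairs; the
    dependency tree is treated as undirected (an assumption).
    """
    neighbours = {}
    for head, dep in edges:
        neighbours.setdefault(head, []).append(dep)
        neighbours.setdefault(dep, []).append(head)
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("tokens are not connected in the tree")


def make_table(rows, dim, seed=0):
    """Stand-in for a learned embedding table W_dis (random values here)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def distance_feature(table, distance):
    """Look up the row for a distance; distances past the table share the last row."""
    return table[min(distance, len(table) - 1)]


def relative_positions(sentence_length, first_idx, second_idx):
    """Signed offsets from every word to the first and second entity (or trigger)."""
    return [(i - first_idx, i - second_idx) for i in range(sentence_length)]


if __name__ == "__main__":
    # Hypothetical toy sentence and dependency edges (head, dependent).
    tokens = ["VEGF", "induces", "angiogenesis", "in", "tumors"]
    edges = [(1, 0), (1, 2), (2, 4), (4, 3)]
    w_dis = make_table(rows=6, dim=4)

    trigger, entity = 1, 0  # "induces" and "VEGF"
    d = tree_distance(edges, trigger, entity)
    print("tree distance:", d)
    print("distance vector:", distance_feature(w_dis, d))
    print("offsets:", relative_positions(len(tokens), trigger, entity))
```

In a trained model the table rows would be learned together with the rest of the network, and the per-word offsets would themselves be mapped to vectors before being concatenated with the word embeddings that feed the CNN.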

