Abstract

BackgroundAutomatic extraction of biomedical events from literature is an important task in the understanding biological systems, allowing for faster update of the latest discoveries automatically. Detecting trigger words which indicate events is a critical step in the process of event extraction, because following steps depend on the recognized triggers. The task in this study is to identify event triggers from the literature across multiple levels of biological organization. In order to achieve high performances, the machine learning based approaches, such as neural networks, must be trained on a dataset with plentiful annotations. However, annotations might be difficult to obtain on the multiple levels, and annotated resources have so far mainly focused on the relations and processes at the molecular level. In this work, we aim to apply transfer learning for multiple-level trigger recognition, in which a source dataset with sufficient annotations on the molecular level is utilized to improve performance on a target domain with insufficient annotations and more trigger types.ResultsWe propose a generalized cross-domain neural network transfer learning architecture and approach, which can share as much knowledge as possible between the source and target domains, especially when their label sets overlap. In the experiments, MLEE corpus is used to train and test the proposed model to recognize the multiple-level triggers as a target dataset. Two different corpora having the varying degrees of overlapping labels with MLEE from the BioNLP’09 and BioNLP’11 Shared Tasks are used as source datasets, respectively. Regardless of the degree of overlap, our proposed approach achieves recognition improvement. Moreover, its performance exceeds previously reported results of other leading systems on the same MLEE corpus.ConclusionsThe proposed transfer learning method can further improve the performance compared with the traditional method, when the labels of the source and target datasets overlap. The most essential reason is that our approach has changed the way parameters are shared. The vertical sharing replaces the horizontal sharing, which brings more sharable parameters. Hence, these more shared parameters between networks improve the performance and generalization of the model on the target domain effectively.

Highlights

  • Automatic extraction of biomedical events from literature is an important task in the understanding biological systems, allowing for faster update of the latest discoveries automatically

  • Corpus description An in-depth investigation is carried out to compare the performance of our proposed Multiple-Level event Trigger recogNizer, MLTrigNer, which is built based on the generalized cross-domain transfer learning BiLSTMCRF model

  • In the trigger types of DataMLEE, the labels overlapped with DataST09 are marked using ‘*’, and the labels overlapped with DataEPI11 are marked using ‘+’

Read more

Summary

Introduction

Automatic extraction of biomedical events from literature is an important task in the understanding biological systems, allowing for faster update of the latest discoveries automatically. We aim to apply transfer learning for multiple-level trigger recognition, in which a source dataset with sufficient annotations on the molecular level is utilized to improve performance on a target domain with insufficient annotations and more trigger types. There are incremental studies in applying Text Mining (TM) techniques for automatic recognizing and tracking of the new discoveries and theories in these biomedical articles. These biomedical TM applications include named entity Event extraction refers to automatically extracting structured representations of biomedical relations, functions and processes from text [3]. When a certain protein regulates the expression of a certain gene and its products are in turn

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call