Making adjustments to event annotations for improved biological event extraction.

Seung-Cheol Baek, Jong C Park

doi:10.1186/s13326-016-0094-9

Abstract

Background: Current state-of-the-art approaches to biological event extraction train statistical models in a supervised manner on corpora annotated with event triggers and event-argument relations. Inspecting such corpora, we observe that the span of event triggers is ambiguous (e.g., "transcriptional activity" vs. "transcriptional"), which leads to inconsistencies across event trigger annotations. Because of these inconsistencies, similar phrases are quite likely to be annotated with different event trigger spans, so a statistical learning algorithm may miss an opportunity to generalize from such event triggers.

Methods: We anticipate that adjusting the span of event triggers to reduce these inconsistencies would meaningfully improve the performance of event extraction systems. In this study, we examine this possibility with the corpora provided by the 2009 BioNLP shared task as a proof of concept. We propose an Informed Expectation-Maximization (EM) algorithm, which trains models using the EM algorithm with a posterior regularization technique that consults the gold-standard event trigger annotations in the form of constraints. We further propose four constraints on the possible event trigger annotations to be explored by the EM algorithm.

Results: The algorithm is shown to outperform the state-of-the-art algorithm on the development corpus in a statistically significant manner and on the test corpus by a narrow margin.

Conclusions: The analysis of the annotations generated by the algorithm shows that there are various types of ambiguity in event annotations, even though they may be small in number.

Highlights

Current state-of-the-art approaches to biological event extraction train statistical models in a supervised manner on annotated corpora, in which event triggers (the expressions indicative of events) and event-argument relations (the relations between events and their participants) are annotated (e.g., [1, 2]).
Similar phrases can have differently annotated event trigger spans, so the corresponding event triggers are characterized differently in syntactic terms. This suggests that a statistical learning algorithm may find it hard to generalize from event triggers that are similar but annotated differently in a training corpus.
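The span inconsistency described in these highlights can be illustrated with a small sketch. Only the "transcriptional activity" / "transcriptional" pair comes from the abstract; the other phrases, the protein names, and the grouping heuristic (keying each trigger by its first token) are assumptions made for illustration, not the authors' procedure.

```python
from collections import defaultdict

# Hypothetical toy annotations: (phrase in text, annotated trigger span).
# Only the "transcriptional activity" vs. "transcriptional" contrast is
# taken from the abstract; everything else is invented for illustration.
annotations = [
    ("transcriptional activity of NF-kB", "transcriptional activity"),
    ("transcriptional activity of AP-1", "transcriptional"),
    ("expression of IL-2", "expression"),
    ("expression of IL-4", "expression"),
]


def spans_by_first_token(pairs):
    """Group the distinct trigger spans by their (lowercased) first token."""
    groups = defaultdict(set)
    for _phrase, trigger in pairs:
        groups[trigger.split()[0].lower()].add(trigger.lower())
    return dict(groups)


def inconsistent(groups):
    """Keep only the keys that were annotated with more than one span."""
    return {key: sorted(spans) for key, spans in groups.items() if len(spans) > 1}


groups = spans_by_first_token(annotations)
print(inconsistent(groups))
```

A learner that sees both spans for otherwise similar phrases receives conflicting supervision, which is the generalization problem the highlights point to.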

Summary

Methods

Following Björne and colleagues [5], we viewed the event extraction task as constructing directed graphs, where event triggers and event-argument relations are encoded with labeled nodes and edges, respectively. Turning to the labels of edges, a question arises as to whether an edge can be labeled with more than one role type, that is, whether an event can take the same protein or event as both THEME and CAUSE. To answer this question, we constructed graphs for sentences in the training corpus of 800 annotated abstracts with the Head-Word rule.

The training algorithm begins with an initial model with all weights set to 0 (line 1). It takes several passes over the training corpus D = ((x1, y1), ..., (xN, yN)), where xi is the i-th sentence and yi is the corresponding gold-standard graph, automatically derived from the gold-standard event annotations using the Head-Word rule (line 2).

In sentence (5), graphs that lack either of the event triggers of these two Positive Regulation events would violate the distance constraint with β ≤ 3.
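The graph view in the Methods summary, with its question about multiply labeled edges, can be sketched as follows. This is a minimal sketch under stated assumptions: the `EventGraph` class, its method names, and the toy graph are hypothetical and are not taken from the paper or from Björne and colleagues' system. The example edge is invented to show the check and says nothing about what the authors found in the corpus.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class EventGraph:
    """Directed graph for one sentence (hypothetical structure).

    Nodes are token indices labeled with an event type or "Protein";
    edges run from an event trigger to an argument and carry role labels.
    """
    node_labels: dict = field(default_factory=dict)
    edge_roles: dict = field(default_factory=lambda: defaultdict(set))

    def add_node(self, token_index, label):
        self.node_labels[token_index] = label

    def add_edge(self, trigger, argument, role):
        self.edge_roles[(trigger, argument)].add(role)

    def multi_role_edges(self):
        """Return the edges labeled with more than one role type."""
        return {
            pair: sorted(roles)
            for pair, roles in self.edge_roles.items()
            if len(roles) > 1
        }


# Invented toy graph: token 0 is a trigger, tokens 1 and 2 are proteins.
graph = EventGraph()
graph.add_node(0, "Positive_regulation")
graph.add_node(1, "Protein")
graph.add_node(2, "Protein")
graph.add_edge(0, 1, "THEME")
graph.add_edge(0, 1, "CAUSE")  # same pair, second role: the case in question
graph.add_edge(0, 2, "THEME")

print(graph.multi_role_edges())
```

Running `multi_role_edges()` over every sentence graph in a corpus is one way to check empirically whether single-label edges are sufficient.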

Journal: Journal of Biomedical Semantics
Publication Date: Sep 16, 2016
License type: CC BY