BioCause: Annotating and analysing causality in the biomedical domain

Claudiu Mihăilă, Tomoko Ohta, Sampo Pyysalo, Sophia Ananiadou

doi:10.1186/1471-2105-14-2

Open Access

https://doi.org/10.1186/1471-2105-14-2

Journal: BMC Bioinformatics
Publication Date: Jan 16, 2013
Citations: 111
License type: CC BY 2.0

Affiliation: University of Manchester

Abstract

Background: Biomedical corpora annotated with event-level information represent an important resource for domain-specific information extraction (IE) systems. However, bio-event annotation alone cannot cater for all the needs of biologists. Unlike work on relation and event extraction, most of which focusses on specific events and named entities, we aim to build a comprehensive resource, covering all statements of causal association present in discourse. Causality lies at the heart of biomedical knowledge, such as diagnosis, pathology or systems biology, and, thus, automatic causality recognition can greatly reduce the human workload by suggesting possible causal connections and aiding in the curation of pathway models. A biomedical text corpus annotated with such relations is, hence, crucial for developing and evaluating biomedical text mining.

Results: We have defined an annotation scheme for enriching biomedical domain corpora with causality relations. This schema has subsequently been used to annotate 851 causal relations to form BioCause, a collection of 19 open-access full-text biomedical journal articles belonging to the subdomain of infectious diseases. These documents have been pre-annotated with named entity and event information in the context of previous shared tasks. We report an inter-annotator agreement rate of over 60% for triggers and of over 80% for arguments using an exact match constraint. These increase significantly using a relaxed match setting. Moreover, we analyse and describe the causality relations in BioCause from various points of view. This information can then be leveraged for the training of automatic causality detection systems.

Conclusion: Augmenting named entity and event annotations with information about causal discourse relations could benefit the development of more sophisticated IE systems. These will further influence the development of multiple tasks, such as enabling textual inference to detect entailments, discovering new facts and providing new hypotheses for experimental work.
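The difference between the exact and relaxed match settings mentioned in the Results can be illustrated with a small sketch. It is not the authors' evaluation code: the abstract does not state how agreement is computed, so the sketch assumes that annotations are character-offset spans, that "relaxed" means any overlap between two spans, and that agreement is an F1-style score between the two annotators. The function names and example offsets are hypothetical.

```python
from typing import Callable, List, Tuple

# A span is (start, end) in character offsets, end exclusive (an assumption).
Span = Tuple[int, int]


def exact_match(a: Span, b: Span) -> bool:
    """Two spans agree only if both boundaries are identical."""
    return a == b


def relaxed_match(a: Span, b: Span) -> bool:
    """Two spans agree if they share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def agreement(ann_a: List[Span], ann_b: List[Span],
              match: Callable[[Span, Span], bool]) -> float:
    """F1-style agreement between two annotators' span lists."""
    if not ann_a or not ann_b:
        return 0.0
    # Fraction of annotator A's spans that annotator B also marked.
    recall = sum(any(match(a, b) for b in ann_b) for a in ann_a) / len(ann_a)
    # Fraction of annotator B's spans that annotator A also marked.
    precision = sum(any(match(b, a) for a in ann_a) for b in ann_b) / len(ann_b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical trigger spans from two annotators on the same document.
ann_a = [(0, 9), (20, 31), (40, 52)]
ann_b = [(0, 9), (22, 31), (60, 70)]

print(f"exact:   {agreement(ann_a, ann_b, exact_match):.2f}")
print(f"relaxed: {agreement(ann_a, ann_b, relaxed_match):.2f}")
```

In this toy case the second pair of spans differs only in its left boundary, so it counts as a disagreement under exact matching and as an agreement under relaxed matching. This is the kind of effect that raises scores in a relaxed setting.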

Highlights

Biomedical corpora annotated with event-level information represent an important resource for domain-specific information extraction (IE) systems
We report on the inter-annotator agreement scores on the doubly annotated section of the corpus and investigate the disagreements between the two experts that were found in this part
The corpus contains a total of 851 causal relation annotations spread over 19 open-access biomedical journal articles regarding infectious diseases

Summary

Introduction

Biomedical corpora annotated with event-level information represent an important resource for domain-specific information extraction (IE) systems. Due to the ever-increasing number of innovations and discoveries in the biomedical domain, the amount of knowledge published daily in the form of research articles is growing exponentially. This has resulted in the need to provide automated, efficient and accurate means of retrieving and extracting user-oriented biomedical knowledge [1,2,3,4]. In response to this need, the biomedical text mining community has accelerated research and the development of tools. Others define various discourse zones and try to determine automatically to which zone a sentence belongs [37].



