Abstract
Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format designed to be both human readable and machine processable. The specific goal of Track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology that gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions and relations, and introduced an evaluation framework that rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text.
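The idea of level-wise partial credit can be illustrated with a minimal sketch. It is not the official Track 4 evaluation code: the helper names `decompose` and `prf`, the small relation list, and the two example BEL statements are assumptions made for illustration only, and the real evaluation handles far more of the BEL syntax. The sketch splits a simple "subject relation object" statement into entities, functions and a relation, then scores a predicted statement against a gold statement at each level separately.

```python
import re

# Illustrative patterns only; real BEL allows quoted names, modifiers, nesting, etc.
ENTITY_RE = re.compile(r"[A-Z]+:[A-Za-z0-9_]+")   # e.g. HGNC:AKT1
FUNCTION_RE = re.compile(r"([A-Za-z]+)\(")        # e.g. p(, kin(
RELATIONS = {"increases", "decreases", "directlyIncreases", "directlyDecreases"}


def decompose(statement):
    """Split a simple BEL statement into entity, function and relation sets."""
    tokens = statement.split()
    relation = {t for t in tokens if t in RELATIONS}
    return {
        "entities": set(ENTITY_RE.findall(statement)),
        "functions": set(FUNCTION_RE.findall(statement)),
        "relation": relation,
    }


def prf(pred, gold):
    """Precision, recall and F-measure between two sets of fragments."""
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    # Hypothetical gold and predicted statements, not taken from the corpus.
    gold = decompose("kin(p(HGNC:AKT1)) directlyDecreases kin(p(HGNC:GSK3B))")
    pred = decompose("p(HGNC:AKT1) decreases p(HGNC:GSK3B)")
    for level in ("entities", "functions", "relation"):
        print(level, prf(pred[level], gold[level]))
```

In this toy case the prediction receives full credit at the entity level, partial credit at the function level (it finds `p` but misses `kin`), and no credit at the relation level, which is the kind of graded feedback a level-wise scheme is meant to give.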
Highlights
Biological networks with a structured syntax are a powerful way of representing biological information and knowledge
We provided training and test corpora selected from the biological networks manually curated in the Network Verification Challenge (NVC), which ensures high data quality [11]
The Biological Expression Language (BEL) track at BioCreative 2015 offered a novel platform for the evaluation of text mining systems capable of dealing with BEL statements
Summary
Biological networks with a structured syntax are a powerful way of representing biological information and knowledge. Well-known examples of standards to formally represent biological networks are the Systems Biology Markup Language (SBML) [1], the Biological Pathway Exchange language (BioPAX) [2] and the Biological Expression Language (BEL; http://www.openbel.org/) [3]. These approaches are designed not only to represent biological events but also to support downstream computational applications. BEL is gaining ground as the de facto standard for systems biology applications because it combines the power of a formalized representation language with a …
© The Author(s) 2016.