Abstract

Information about bacteria biotopes (BB) is important for fundamental research and applications in microbiology. The BB task at BioNLP-OST 2019 focuses on extracting the locations and phenotypes of microorganisms from PubMed abstracts and full-text excerpts. Its subtask BB-rel+ner aims to recognize the relevant entities and extract the relations among them. The corresponding corpus has some distinctive features (e.g. nested entities) that are challenging to handle. As a result, previous methods achieved low performance on entity and relation extraction and made limited use of the interaction between named entity recognition and relation extraction, leaving much room for improvement. We propose a span-based model that jointly extracts entities and relations from biomedical text about BBs. To alleviate the shortage of annotated data in this domain-specific task, we encode sentences with a BERT (Bidirectional Encoder Representations from Transformers) model pre-trained on a domain-specific corpus. Our model treats every span in a sentence as a potential entity mention and computes relation scores between the most confident entity spans, based on the representations of the spans and of the context between them. Experiments on the BB-rel+ner 2019 corpus demonstrate that our model performs significantly better than the state-of-the-art method, with a 21.6% reduction in slot error rate (SER) for relation extraction. Our model is also effective at recognizing nested entities. Furthermore, the model can be applied to the CHEMPROT corpus for joint extraction of chemical-protein entities and relations, where it achieves state-of-the-art performance. Our source code is available at https://github.com/zmmzGitHub/SpanMB_BERT. Supplementary data are available at Bioinformatics online.
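The span-based idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `max_width` limit, the top-k pruning rule, and the toy scores are all assumptions made for illustration, and in the actual model the span and relation scores would come from learned BERT-based representations.

```python
from typing import List, Tuple

# A span is a pair of inclusive token indices (start, end).
Span = Tuple[int, int]


def enumerate_spans(tokens: List[str], max_width: int) -> List[Span]:
    """List every span of at most max_width tokens (hypothetical helper).

    Overlapping and nested spans are all kept, which is what lets a
    span-based model represent nested entity mentions.
    """
    spans = []
    for start in range(len(tokens)):
        last = min(start + max_width, len(tokens))
        for end in range(start, last):
            spans.append((start, end))
    return spans


def top_k_spans(spans: List[Span], scores: List[float], k: int) -> List[Span]:
    """Keep the k highest-scoring spans (assumed pruning rule)."""
    ranked = sorted(zip(spans, scores), key=lambda pair: pair[1], reverse=True)
    return [span for span, _ in ranked[:k]]


def candidate_pairs(spans: List[Span]) -> List[Tuple[Span, Span]]:
    """Ordered pairs of distinct spans that a relation scorer would rate."""
    return [(a, b) for a in spans for b in spans if a != b]


tokens = ["Lactobacillus", "casei", "in", "cheese"]
spans = enumerate_spans(tokens, max_width=2)

# Toy confidence scores standing in for a learned mention scorer (assumption).
toy_scores = [0.2, 0.9, 0.1, 0.0, 0.0, 0.1, 0.8]
kept = top_k_spans(spans, toy_scores, k=2)
pairs = candidate_pairs(kept)
```

With these toy scores, the spans kept are "Lactobacillus casei" and "cheese", and only pairs of kept spans are passed on for relation scoring. Pruning before pairing matters because the number of candidate pairs grows quadratically with the number of spans.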
