Abstract

Assigning appropriate rhetorical roles, such as "background," "intervention," and "outcome," to sentences in biomedical documents can help physicians locate evidence and resources for medical treatment and decision-making. Sequence labeling and span-based methods are frequently employed for this task, but each has drawbacks. Sequence labeling disregards a document's semantic structure, resulting in a lack of semantic coherence across consecutive sentences. Span-based approaches either require enumerating all potential spans, which can be time-consuming, or may misclassify sentences within long spans. An approach is therefore needed that explicitly models the semantic structure of documents and captures boundary information to label sentences in biomedical documents accurately and efficiently. To address these challenges, we propose the boundary-aware dual biaffine model, which explicitly models the semantic structure of documents and incorporates boundary information via a dual biaffine layer. We introduce a dynamic programming algorithm that minimizes missing labels and overlapping predictions and yields globally optimal decoding results. We evaluate our approach on three benchmark datasets: PubMed 20k RCT, PubMed-PICO, and NICTA-PIBOSO. The experimental results demonstrate that our approach outperforms strong baselines and achieves state-of-the-art performance on PubMed 20k RCT and PubMed-PICO. It also achieves competitive results on NICTA-PIBOSO. Availability: Our code and data will be available at: https://github.com/CSU-NLP-Group/Sequential-Sentence-Classification.
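
The abstract does not spell out the decoding algorithm, so the following is only a minimal sketch of one way a dynamic program could select labeled spans so that every sentence receives exactly one label (no missing labels, no overlaps) while maximizing the total span score. It is a generic segmentation-style dynamic program, not the authors' published method. The function name `decode_spans`, the `scores` dictionary format, and the use of additive span scores are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical span format: (start, end_inclusive, label) over sentence indices.
Span = Tuple[int, int, str]


def decode_spans(n: int, scores: Dict[Span, float]) -> List[Span]:
    """Pick non-overlapping labeled spans that cover sentences 0..n-1
    and maximize the sum of their scores (illustrative sketch only)."""
    neg_inf = float("-inf")
    # best[k] is the best total score for covering the first k sentences.
    best: List[float] = [neg_inf] * (n + 1)
    # back[k] remembers the last span used to reach best[k].
    back: List[Optional[Span]] = [None] * (n + 1)
    best[0] = 0.0

    # Group candidate spans by their end index for quick lookup.
    by_end: Dict[int, List[Tuple[int, str, float]]] = {}
    for (start, end, label), score in scores.items():
        by_end.setdefault(end, []).append((start, label, score))

    for k in range(1, n + 1):
        # A span ending at sentence k-1 extends a cover of the first `start` sentences.
        for start, label, score in by_end.get(k - 1, []):
            if best[start] == neg_inf:
                continue  # no valid cover of the prefix before this span
            candidate = best[start] + score
            if candidate > best[k]:
                best[k] = candidate
                back[k] = (start, k - 1, label)

    if best[n] == neg_inf:
        raise ValueError("candidate spans cannot cover every sentence")

    # Follow back-pointers from the end of the document to recover the spans.
    spans: List[Span] = []
    k = n
    while k > 0:
        span = back[k]
        assert span is not None
        spans.append(span)
        k = span[0]
    spans.reverse()
    return spans
```

Because each prefix of the document is solved once and reused, the selected spans are optimal over all complete, non-overlapping covers built from the candidate set, and the running time is linear in the number of candidate spans.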
