Abstract

Dividing biomedical abstracts into several segments with rhetorical roles is essential for supporting researchers’ information access in the biomedical domain. Conventional methods have treated the task as sequence labeling based on sequential sentence classification, i.e., they assign a rhetorical label to each sentence by considering its context in the abstract. However, these methods have a critical problem: they are prone to mislabeling long runs of consecutive sentences that share the same rhetorical label. To tackle this problem, we propose sequential span classification, which assigns a rhetorical label not to a single sentence but to a span consisting of consecutive sentences. Accordingly, we introduce Neural Semi-Markov Conditional Random Fields to assign labels to such spans by considering all possible spans of various lengths. Experimental results on the PubMed 20k RCT and NICTA-PIBOSO datasets demonstrate that our proposed method achieved the best micro sentence-F1 score as well as the best micro span-F1 score.
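To make the idea of "considering all possible spans of various lengths" concrete, the following is a minimal sketch of semi-Markov Viterbi decoding over sentence spans. It is not the authors' implementation: the function names, the `max_len` cap, and the toy scores are all assumptions made for illustration. In a neural model the span scores would come from a learned encoder rather than a hand-written table.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_decode(
    n: int,
    labels: Sequence[str],
    span_score: Callable[[int, int, str], float],
    trans_score: Callable[[str, str], float],
    max_len: int,
) -> List[Span]:
    """Return the highest-scoring segmentation of sentences 0..n-1 into labeled spans."""
    neg_inf = float("-inf")
    # best[j][y]: best score of any segmentation of sentences [0, j)
    # whose final span carries label y.
    best: List[Dict[str, float]] = [{y: neg_inf for y in labels} for _ in range(n + 1)]
    back: List[Dict[str, Optional[Tuple[int, Optional[str]]]]] = [
        {y: None for y in labels} for _ in range(n + 1)
    ]

    for j in range(1, n + 1):
        # Try every span [i, j) of length at most max_len that ends at j.
        for i in range(max(0, j - max_len), j):
            for y in labels:
                s = span_score(i, j, y)
                if i == 0:
                    candidates = [(s, None)]  # first span: no previous label
                else:
                    candidates = [
                        (best[i][yp] + trans_score(yp, y) + s, yp) for yp in labels
                    ]
                for score, yp in candidates:
                    if score > best[j][y]:
                        best[j][y] = score
                        back[j][y] = (i, yp)

    spans: List[Span] = []
    if n == 0:
        return spans
    # Follow back-pointers from the best final label to recover the spans.
    j = n
    y: Optional[str] = max(labels, key=lambda lab: best[n][lab])
    while j > 0:
        i, prev = back[j][y]
        spans.append((i, j, y))
        j, y = i, prev
    spans.reverse()
    return spans


# Toy per-sentence scores (hypothetical numbers, not from the paper).
SENT_SCORES = [
    {"BACKGROUND": 2.0, "METHODS": 0.0},
    {"BACKGROUND": 1.0, "METHODS": 0.0},
    {"BACKGROUND": 0.0, "METHODS": 2.0},
    {"BACKGROUND": 0.0, "METHODS": 2.0},
]


def toy_span_score(i: int, j: int, label: str) -> float:
    # Assumption: a span's score is the sum of its sentences' scores.
    return sum(SENT_SCORES[k][label] for k in range(i, j))


def toy_trans_score(prev: str, cur: str) -> float:
    # Assumption: penalize two adjacent spans that share a label.
    return -1.0 if prev == cur else 0.0


decoded = semi_markov_decode(
    4, ["BACKGROUND", "METHODS"], toy_span_score, toy_trans_score, max_len=4
)
print(decoded)
```

Because each candidate is a whole span, a run of sentences with the same role is scored and labeled as one unit instead of being labeled sentence by sentence. The cost is that decoding time grows with the maximum span length allowed.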

Highlights

  • Dividing documents into several rhetorical segments is a fundamental task in natural language processing (NLP)

  • Most previous methods for PubMed abstracts have treated the task as sequence labeling, namely sequential sentence classification, which assigns rhetorical labels with a B(egin)/I(nside) tag set to each sentence while considering its context in the abstract

  • We introduce Neural Semi-Markov Conditional Random Fields (SCRFs)


Summary

Introduction

Dividing documents into several rhetorical segments is a fundamental task in natural language processing (NLP). Most previous methods for PubMed abstracts have treated the task as sequence labeling, namely sequential sentence classification, which assigns rhetorical labels with a B(egin)/I(nside) tag set to each sentence while considering its context in the abstract. To this end, several statistical methods with hand-engineered features have been proposed, including Hidden Markov Models (HMMs) (Lin et al., 2006) and Conditional Random Fields (CRFs) (Hirohata et al., 2008; Kim et al., 2011; Hassanzadeh et al., 2014).
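As an illustration of the B(egin)/I(nside) scheme used in sequential sentence classification, the sketch below converts per-sentence rhetorical labels into spans and then into B/I tags. The helper names and the example labels are hypothetical, and merging adjacent sentences with the same label into one span is an assumption made for this example.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def labels_to_spans(labels: Sequence[str]) -> List[Span]:
    """Merge each run of identical consecutive labels into one span."""
    spans: List[Span] = []
    start = 0
    for label, group in groupby(labels):
        length = len(list(group))
        spans.append((start, start + length, label))
        start += length
    return spans


def spans_to_bi_tags(spans: Sequence[Span]) -> List[str]:
    """Tag the first sentence of each span B-<label> and the rest I-<label>."""
    tags: List[str] = []
    for start, end, label in spans:
        tags.append(f"B-{label}")
        tags.extend(f"I-{label}" for _ in range(end - start - 1))
    return tags


# Example abstract with six sentences (labels chosen for illustration only).
sentence_labels = [
    "BACKGROUND", "BACKGROUND", "METHODS", "RESULTS", "RESULTS", "RESULTS",
]
spans = labels_to_spans(sentence_labels)
bi_tags = spans_to_bi_tags(spans)
print(spans)
print(bi_tags)
```

A sentence-level tagger has to predict every one of these tags correctly to recover a long span, so a single wrong tag in the middle of a run splits it.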

