Abstract

A dependency parser generates both a syntactic structure and a shallow semantic structure of a sentence. It is a fundamental component of natural language processing (NLP) pipelines, which are critical for research using electronic health records (EHRs). However, current work mainly applies parsers developed for general-domain English to clinical text. Deep learning based dependency parsers have not been formally evaluated or compared in the medical domain, and no state-of-the-art dependency parsing performance has been established on clinical text. In this study, we investigated the performance of four state-of-the-art deep learning based dependency parsers: the Stanford parser, Bist-parser, the dependency_tf parser, and the jPTDP parser. We evaluated them on two datasets: (1) the MiPACQ Treebank and (2) a treebank of progress notes. The original parsers achieved lower performance on clinical text than on general English text. After retraining on the clinical Treebank, all parsers performed better. Word embeddings from Gigaword and MIMIC-III yielded comparable performance. Interestingly, the transition-based parsers generalized better across treebanks than the graph-based parsers. Overall, Bist-parser achieved the best performance on MiPACQ (88.95% UAS, 92.69% LS, 86.10% LAS), and the Stanford parser achieved the best performance on progress notes (84.01% UAS, 89.97% LS, 80.72% LAS).
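
The three reported metrics can be illustrated with a minimal sketch. It assumes the conventional definitions, which the abstract itself does not spell out: UAS (unlabeled attachment score) is the share of tokens whose predicted head is correct, LS (label score) is the share whose predicted dependency label is correct, and LAS (labeled attachment score) is the share for which both are correct. The function name `attachment_scores` and the four-token example are hypothetical and are not taken from the study's evaluation code or data.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 stands for the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float, float]:
    """Return (UAS, LS, LAS) as fractions in [0, 1] for aligned token lists."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")

    n = len(gold)
    # UAS: the predicted head matches the gold head.
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / n
    # LS: the predicted label matches the gold label, regardless of head.
    ls = sum(g[1] == p[1] for g, p in zip(gold, pred)) / n
    # LAS: both head and label match.
    las = sum(g == p for g, p in zip(gold, pred)) / n
    return uas, ls, las


# Made-up analysis of "Patient denies chest pain" (tokens numbered from 1).
gold = [(2, "nsubj"), (0, "root"), (4, "compound"), (2, "dobj")]
# Hypothetical parser output: wrong label on token 3, wrong head on token 4.
pred = [(2, "nsubj"), (0, "root"), (4, "amod"), (3, "dobj")]

uas, ls, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2%} LS={ls:.2%} LAS={las:.2%}")
```

Because LAS requires both the head and the label to be right, it can never exceed UAS or LS, which matches the ordering of the figures reported for both treebanks.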

