Parsing clinical text: how good are the state-of-the-art parsers?

Min Jiang, Jung-Wei Fan, Hua Xu, Buzhou Tang, Josh Denny, Yang Huang

doi:10.1186/1472-6947-15-s1-s2


Open Access


Abstract


Background: Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain, including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain.

Methods: In this study, we investigated the performance of three state-of-the-art parsers: the Stanford parser, the Bikel parser, and the Charniak parser, using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank based guideline; and (2) the MiPACQ Treebank, which was developed from pathology notes and clinical notes and contains 13,091 sentences. We conducted three experiments on both datasets. First, we measured the performance of the three state-of-the-art parsers on the clinical Treebanks with their default settings. Then we re-trained the parsers using the clinical Treebanks and evaluated their performance using 10-fold cross validation. Finally, we re-trained the parsers by combining the clinical Treebanks with the Penn Treebank.

Results: Our results showed that the original parsers achieved lower performance on clinical text (Bracketing F-measure in the range of 66.6%-70.3%) compared to general English text. After retraining on the clinical Treebanks, all parsers achieved better performance, with the best performance from the Stanford parser, which reached the highest Bracketing F-measure of 73.68% on progress notes and 83.72% on the MiPACQ corpus using 10-fold cross validation. When the combined clinical Treebanks and Penn Treebank were used, of the three parsers, the Charniak parser achieved the highest Bracketing F-measure of 73.53% on progress notes and the Stanford parser reached the highest F-measure of 84.15% on the MiPACQ corpus.

Conclusions: Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers' performance on clinical text, and combining clinical and open domain corpora might achieve optimal performance for parsing clinical text.
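The abstract reports results as a Bracketing F-measure. As a rough illustration of what that metric measures, the sketch below computes labeled bracketing precision, recall, and F-measure for one gold/predicted pair of Penn Treebank style bracketed trees. It is a minimal sketch, not the evaluation tool used in the study: the function names (`labeled_spans`, `bracketing_prf`) and the example sentence are hypothetical, and it assumes part-of-speech brackets are excluded and punctuation is not treated specially.

```python
from collections import Counter


def labeled_spans(bracketed):
    """Collect (label, start, end) for every phrase-level bracket.

    Brackets that directly wrap a single word (part-of-speech tags)
    are skipped; this is an assumption of this sketch.
    """
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []  # each entry: [label, start word index, has bracketed child]
    pos = 0     # number of words consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            if tokens[i + 1] == "(":
                # Unlabeled wrapper bracket, e.g. "( (S ...))"
                stack.append(["", pos, False])
                i += 1
            else:
                stack.append([tokens[i + 1], pos, False])
                i += 2
        elif tok == ")":
            label, start, has_child = stack.pop()
            if has_child and label:
                spans[(label, start, pos)] += 1
            if stack:
                stack[-1][2] = True
            i += 1
        else:
            pos += 1  # a word
            i += 1
    return spans


def bracketing_prf(gold, predicted):
    """Return (precision, recall, F-measure) over labeled brackets."""
    g = labeled_spans(gold)
    p = labeled_spans(predicted)
    matched = sum((g & p).values())  # multiset intersection
    precision = matched / sum(p.values()) if p else 0.0
    recall = matched / sum(g.values()) if g else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: the predicted tree splits "chest pain" into two NPs.
gold = "(S (NP (DT the) (NN patient)) (VP (VBD denied) (NP (NN chest) (NN pain))))"
pred = "(S (NP (DT the) (NN patient)) (VP (VBD denied) (NP (NN chest)) (NP (NN pain))))"
precision, recall, f_measure = bracketing_prf(gold, pred)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.4f}")
```

In this example three of the five predicted brackets match the four gold brackets, so precision is 0.60 and recall is 0.75. A corpus-level score would normally sum the matched, gold, and predicted counts over all sentences before computing the F-measure, rather than averaging per-sentence values.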

Highlights

Parsing is the process of assigning syntactic structures to input strings according to a grammar.
Our study demonstrates that re-training using clinical Treebanks is critical for improving general English parsers’ performance on clinical text, and combining clinical and open domain corpora might achieve optimal performance for parsing clinical text.
We evaluated the performance of three state-of-the-art parsers: the Stanford parser [4], the Bikel parser [5] and the Charniak parser [6], using two clinical Treebanks: the Treebank of progress notes reported in Fan et al. [24] and the MiPACQ Treebank.

Summary

Introduction

Parsing is the process of assigning syntactic structures to input strings according to a grammar. Collins extended his probabilistic parser developed in 1996 with three generative models to calculate all the probabilities of the parse tree head nodes, including adjunct/complement distinction and wh-movement. Evaluation showed that these models surpassed Magerman’s parser and his own previous parsers and achieved an F-measure of 87.8%. Clegg and Shepherd [10] developed an evaluation method that uses dependency graphs as an intermediate representation, with which they compared four parsers on the GENIA corpus: the Collins parser [3], the Bikel parser [5], the Stanford parser [4], and the Charniak-Lease parser [6]. Their results showed that the Bikel and Charniak-Lease parsers achieved better performance than the others, but the overall performance of all the parsers dropped when compared with results from the Penn Treebank. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain.



Journal: BMC Medical Informatics and Decision Making	Publication Date: May 20, 2015
Citations: 13	License type: cc-by


