Abstract
Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA), and describes a method for generating coreference chains using progressively pruned linked lists that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results give an F-measure for each corpus of 79.2% and 87.5%, respectively. A baseline of blind coreference of mentions of the same class gives F-measures of 65.3% and 51.9% respectively. For the ODIE corpus, recall is significantly improved over the baseline (p<0.05) but overall there was no statistically significant improvement in F-measure (p>0.05). For the i2b2/VA corpus, recall, precision, and F-measure are significantly improved over the baseline (p<0.05). Overall, our approach offers performance at least as good as human annotators and greatly increased performance over general-purpose tools. The system uses a number of open-source components that are available to download.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.