Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge

James Cormack, Chinmoy Nath, David Milward, Kalpana Raja, Siddhartha R Jonnalagadda

doi:10.1016/j.jbi.2015.06.030

Open Access

Journal: Journal of Biomedical Informatics	Publication Date: Jul 22, 2015
Citations: 39	License type: cc-by-nc-nd

Affiliation: Linguamatics (United Kingdom), Northwestern University

#Gold Standard Data #Corpus Statistics

Abstract

This paper describes the use of an agile text mining platform (Linguamatics’ Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven, rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics, and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-score of 91.7%, one percent behind the top-performing system.
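
As an illustration of the metric quoted above, the following minimal sketch computes a balanced F-score from true-positive, false-positive, and false-negative counts, and pools the counts across classes. The class names and all counts are hypothetical and are not taken from the paper or the challenge data; pooling counts (micro-averaging) is assumed here for illustration and may differ from the official evaluation script.

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical (tp, fp, fn) counts for a frequent and a rare class.
counts = {
    "frequent_class": (90, 10, 10),
    "rare_class": (5, 5, 15),
}

# Per-class scores: the rare class scores far lower than the frequent one.
per_class = {name: f_score(*c) for name, c in counts.items()}

# Pooled (micro-averaged) score: sum the counts first, then compute F1.
totals = [sum(c[i] for c in counts.values()) for i in range(3)]
micro = f_score(*totals)

for name, score in per_class.items():
    print(f"{name}: {score:.3f}")
print(f"pooled: {micro:.3f}")
```

In this toy setup the pooled score stays close to that of the frequent class, so a single headline figure can hide weak performance on sparsely represented classes, which is one way training-set imbalance can matter.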
