Abstract

Automatic identification of clinical concepts in electronic medical records (EMR) is useful not only in forming a complete longitudinal health record of patients, but also in recovering missing codes for billing, reducing costs, finding more accurate clinical cohorts for clinical trials, and enabling better clinical decision support. Existing systems for clinical concept extraction are mostly knowledge-driven, relying on exact match retrieval from original or lemmatized reports, and very few of them are scaled up to handle large volumes of complex, diverse data. In this demonstration we will showcase a new system for real-time detection of clinical concepts in EMR. The system features a large vocabulary of over 5.6 million concepts. It achieves high precision and recall, with good tolerance to typos through the use of a novel prefix indexing and subsequence matching algorithm, along with a recursive negation detector based on efficient, deep parsing. Our system has been tested on over 12.9 million reports of more than 200 different types, collected from 800,000+ patients. A comparison with the state of the art shows that it outperforms previous systems in addition to being the first system to scale to such large collections.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.