A Case Demonstration of the Open Health Natural Language Processing Toolkit From the National COVID-19 Cohort Collaborative and the Researching COVID to Enhance Recovery Programs for a Natural Language Processing System for COVID-19 or Postacute Sequelae of SARS CoV-2 Infection: Algorithm Development and Validation.

Andrew Wen, Corey Elowsky, Karthik Natarajan, Daniel R Harris, Melissa A Haendel, Huan He, Robert T Miller, Rui Zhang, Christopher G Chute, Mary Saltz, Liwei Wang, Peter J Leese, Janos Hajagos, Matvey B Palchuk, Andrew E Williams, Jordan Donovan, Mikhail Zemmel, Ramakanth Kavuluru, Nick Guthe, Emily R Pfaff, Garo Stone-Derhargopian, Richard A Moffitt, Sritha Rajupet, Sunyang Fu, Hongfang Liu, Farrukh M Koraishy, Nishanth P Pavinkurve, Robert D Pates, Veena Lingam, Sijia Liu, The Recover Initiative, Lora Lingrey, National Covid Cohort Collaborative, David A Hanauer, Paul I Kovach

doi:10.2196/49997

Open Access

https://doi.org/10.2196/49997

Journal: JMIR medical informatics	Publication Date: Sep 9, 2024
Citations: 1	License type: cc-by

Abstract

A wealth of clinically relevant information is only obtainable within unstructured clinical narratives, leading to great interest in clinical natural language processing (NLP). While a multitude of approaches to NLP exist, current algorithm development approaches have limitations that can slow the development process. These limitations are exacerbated when the task is emergent, as is the case currently for NLP extraction of signs and symptoms of COVID-19 and postacute sequelae of SARS-CoV-2 infection (PASC).

This study aims to highlight the current limitations of existing NLP algorithm development approaches that are exacerbated by NLP tasks surrounding emergent clinical concepts and to illustrate our approach to addressing these issues through the use case of developing an NLP system for the signs and symptoms of COVID-19 and PASC.

We used 2 preexisting studies on PASC as a baseline to determine a set of concepts that should be extracted by NLP. This concept list was then used in conjunction with the Unified Medical Language System to autonomously generate an expanded lexicon to weakly annotate a training set, which was then reviewed by a human expert to generate a fine-tuned NLP algorithm. The annotations from a fully human-annotated test set were then compared with NLP results from the fine-tuned algorithm. The NLP algorithm was then deployed to 10 additional sites that were also running our NLP infrastructure. Of these 10 sites, 5 were used to conduct a federated evaluation of the NLP algorithm.

An NLP algorithm consisting of 12,234 unique normalized text strings corresponding to 2366 unique concepts was developed to extract COVID-19 or PASC signs and symptoms. An unweighted mean dictionary coverage of 77.8% was found for the 5 sites.

The evolutionary and time-critical nature of the PASC NLP task significantly complicates existing approaches to NLP algorithm development. In this work, we present a hybrid approach using the Open Health Natural Language Processing Toolkit aimed at addressing these needs with a dictionary-based weak labeling step that minimizes the need for additional expert annotation while still preserving the fine-tuning capabilities of expert involvement.
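The dictionary-based weak labeling step and the unweighted per-site average described in the abstract can be illustrated with a minimal Python sketch. This is not the Open Health Natural Language Processing Toolkit's implementation or API. The lexicon entries, the placeholder concept identifiers, the function names, and in particular the definition of "dictionary coverage" used here (the fraction of lexicon strings observed at least once in a site's notes) are assumptions made for illustration only; the abstract does not define the metric. The sketch also ignores negation and context, which a real clinical NLP system would need to handle.

```python
import re
from statistics import mean

# Hypothetical lexicon: normalized text string -> concept identifier.
# The identifiers are placeholders, not real UMLS concept IDs.
LEXICON = {
    "shortness of breath": "C_DYSPNEA",
    "dyspnea": "C_DYSPNEA",
    "fatigue": "C_FATIGUE",
    "loss of smell": "C_ANOSMIA",
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    lowered = text.lower()
    no_punct = re.sub(r"[^a-z0-9 ]+", " ", lowered)
    return re.sub(r"\s+", " ", no_punct).strip()


def weak_label(note, lexicon):
    """Return sorted (term, concept) pairs whose term occurs as whole words in the note."""
    padded = f" {normalize(note)} "
    hits = [(term, concept) for term, concept in lexicon.items()
            if f" {term} " in padded]
    return sorted(hits)


def dictionary_coverage(notes, lexicon):
    """Assumed metric: share of lexicon strings seen at least once in the notes."""
    seen = set()
    for note in notes:
        seen.update(term for term, _ in weak_label(note, lexicon))
    return len(seen) / len(lexicon)


def unweighted_mean_coverage(site_notes, lexicon):
    """Simple average of per-site coverage; each site counts equally,
    regardless of how many notes it contributes."""
    return mean(dictionary_coverage(notes, lexicon)
                for notes in site_notes.values())


if __name__ == "__main__":
    # Toy notes for two hypothetical sites.
    sites = {
        "site_a": ["Pt reports fatigue, dyspnea."],
        "site_b": ["Loss of smell since March."],
    }
    for name, notes in sites.items():
        print(name, [weak_label(n, LEXICON) for n in notes])
    print("unweighted mean coverage:", unweighted_mean_coverage(sites, LEXICON))
```

Averaging per-site values without weighting means a small site influences the result as much as a large one, which is the sense of "unweighted" in the reported figure. The weak labels produced this way would still go to a human expert for review, as the abstract describes.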

Similar Papers

Natural language processing algorithms for mapping clinical text fragments onto ontology concepts: a systematic review and recommendations for future studies
Martijn G Kersloot ... Derk L Arts
Journal of biomedical semantics | VOL. 11
16 Nov 2020

Identification of recurrent atrial fibrillation using natural language processing applied to electronic health records.
Chengyi Zheng ... Amanda Allen
European Heart Journal - Quality of Care and Clinical Outcomes | VOL. 10
30 Mar 2023

Natural language processing of radiology reports for identification of skeletal site-specific fractures
Yanshan Wang ... Sunghwan Sohn
BMC Medical Informatics and Decision Making | VOL. 19
01 Apr 2019

The use of natural language processing to identify vaccine-related anaphylaxis at five health care systems in the Vaccine Safety Datalink.
Wei Yu ... Fagen Xie
Pharmacoepidemiology and Drug Safety | VOL. 29
03 Dec 2019
