Hospital-wide natural language processing summarising the health data of 1 million patients.

Daniel M Bean, James Teo, Anthony Shek, Richard J B Dobson, Zeljko Kraljevic

doi:10.1371/journal.pdig.0000218

Abstract

Electronic health records (EHRs) represent a major repository of real-world clinical trajectories, interventions and outcomes. While modern enterprise EHRs try to capture data in structured, standardised formats, much of the information captured in the EHR is still recorded only as unstructured text and can only be transformed into structured codes by manual processes. Recently, Natural Language Processing (NLP) algorithms have reached a level of performance suitable for large-scale and accurate information extraction from clinical text. Here we describe the application of open-source named-entity-recognition and linkage (NER+L) methods (CogStack, MedCAT) to the entire text content of a large UK hospital trust (King's College Hospital, London). The resulting dataset contains 157M SNOMED concepts generated from 9.5M documents for 1.07M patients over a period of 9 years. We present a summary of prevalence and disease onset, as well as a patient embedding that captures major comorbidity patterns at scale. NLP has the potential to transform the health data lifecycle through large-scale automation of a traditionally manual task.
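
As an illustration of the kind of prevalence summary mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the NER+L output has already been reduced to (patient identifier, concept label) pairs; the function name `concept_prevalence`, the toy identifiers and the concept labels are all hypothetical, and prevalence is taken here simply as the fraction of patients with at least one mention of a concept.

```python
from collections import defaultdict


def concept_prevalence(annotations, n_patients):
    """Return, for each concept, the fraction of patients with >= 1 mention.

    annotations: iterable of (patient_id, concept) pairs (assumed format).
    n_patients: total number of patients in the cohort.
    """
    patients_by_concept = defaultdict(set)
    for patient_id, concept in annotations:
        # A set ensures repeated mentions for one patient count only once.
        patients_by_concept[concept].add(patient_id)
    return {
        concept: len(patients) / n_patients
        for concept, patients in patients_by_concept.items()
    }


# Toy, made-up annotations for a cohort of 4 patients (illustration only).
toy_annotations = [
    ("p1", "Hypertension"),
    ("p1", "Hypertension"),  # repeated mention, same patient
    ("p2", "Hypertension"),
    ("p2", "Diabetes mellitus"),
    ("p3", "Asthma"),
]
prevalence = concept_prevalence(toy_annotations, n_patients=4)
```

Counting distinct patients rather than raw mentions keeps frequently documented conditions from being over-weighted simply because they appear in many notes for the same person.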

Journal: PLOS Digital Health
Publication Date: May 9, 2023
Citations: 8
License type: CC BY 4.0
