Abstract

Receiving timely and appropriate treatment is crucial for better health outcomes, and research on the contribution of specific variables is essential. In the mental health domain, an important research variable is the date of psychosis symptom onset, as longer delays in treatment are associated with worse intervention outcomes. The growing adoption of electronic health records (EHRs) within mental health services provides an invaluable opportunity to study this problem at scale retrospectively. However, disease onset information is often only available in open text fields, requiring natural language processing (NLP) techniques for automated analyses. Since this variable can be documented at different points during a patient’s care, NLP methods that model clinical and temporal associations are needed. We address the identification of psychosis onset by: 1) manually annotating a corpus of mental health EHRs with disease onset mentions, 2) modelling the underlying NLP problem as a paragraph classification approach, and 3) combining multiple onset paragraphs at the patient level to generate a ranked list of likely disease onset dates. For 22/31 test patients (71%) the correct onset date was found among the top-3 NLP predictions. The proposed approach was also applied at scale, allowing an onset date to be estimated for 2483 patients.
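The patient-level step (3) and the top-3 evaluation can be illustrated with a minimal sketch. The abstract does not specify how onset paragraphs are combined, so the scoring rule here (summing a hypothetical per-paragraph classifier score for each candidate date) is an assumption made purely for illustration, as are the function names `rank_onset_dates` and `top_k_accuracy` and the example data.

```python
from collections import defaultdict
from datetime import date


def rank_onset_dates(paragraph_predictions, top_k=3):
    """Rank candidate onset dates for one patient.

    paragraph_predictions: list of (onset_date, score) pairs, one per
    paragraph classified as an onset mention. Summing the scores per
    date is an assumed aggregation rule, not the paper's stated method.
    """
    totals = defaultdict(float)
    for onset_date, score in paragraph_predictions:
        totals[onset_date] += score
    # Highest total first; ties broken by the earlier date.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [onset_date for onset_date, _ in ranked[:top_k]]


def top_k_accuracy(patients, top_k=3):
    """Fraction of patients whose gold onset date is in the top-k list.

    patients: list of (paragraph_predictions, gold_onset_date) pairs.
    """
    hits = sum(
        1 for predictions, gold in patients
        if gold in rank_onset_dates(predictions, top_k)
    )
    return hits / len(patients)


# Invented example data for a single patient.
example = [
    (date(2011, 6, 1), 0.9),
    (date(2010, 1, 1), 0.4),
    (date(2011, 6, 1), 0.7),
    (date(2012, 3, 1), 0.2),
    (date(2009, 1, 1), 0.1),
]
ranked = rank_onset_dates(example)
```

Under this evaluation, the reported result corresponds to 22 hits out of 31 test patients, that is, 22/31 ≈ 0.71.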

Highlights

  • Receiving timely and appropriate treatment is crucial for better health outcomes, and research on the contribution of specific variables is essential

  • Research findings have had a considerable impact on mental health services in the United Kingdom, leading to the establishment of early intervention teams aimed at reducing duration of untreated psychosis (DUP) by supporting people with a first episode of psychosis (FEP)

  • To build a corpus for natural language processing (NLP) development, we extracted electronic health record (EHR) documents for patients who had been diagnosed with schizophrenia, considering all patients referred after 1st January 2012 and looking at a 3-month window after the first referral date
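The document selection described in the last highlight could be sketched as follows. This is a hedged illustration only: the helper names (`add_months`, `documents_in_window`), the representation of documents as `(date, text)` pairs, the strict reading of "after 1st January 2012", and the inclusive window bounds are all assumptions, not details given in the source.

```python
import calendar
from datetime import date

# Assumed cohort boundary: referrals strictly after this date qualify.
COHORT_START = date(2012, 1, 1)


def add_months(d, months):
    """Return d shifted forward by whole months, clamping the day
    to the last valid day of the target month."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def documents_in_window(first_referral, documents, months=3):
    """Keep documents dated within `months` of the first referral.

    documents: list of (document_date, text) pairs for one patient.
    Returns an empty list if the patient falls outside the cohort.
    """
    if first_referral <= COHORT_START:
        return []
    window_end = add_months(first_referral, months)
    # Inclusive bounds on both ends (an assumption).
    return [doc for doc in documents if first_referral <= doc[0] <= window_end]
```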

Summary

Introduction

Receiving timely and appropriate treatment is crucial for better health outcomes, and research on the contribution of specific variables is essential. In the mental health domain, an important research variable is the date of psychosis symptom onset, as longer delays in treatment are associated with worse intervention outcomes. Disease onset information is often only available in open text fields, requiring natural language processing (NLP) techniques for automated analyses. Since this variable can be documented at different points during a patient’s care, NLP methods that model clinical and temporal associations are needed. In mental health electronic health records (EHRs), compared to other medical domains, clinically relevant information is predominantly documented in free text rather than in structured fields [10]. Unlocking this information for patient-level research can be challenging because there is large variability in how concepts are defined and documented, potentially requiring different layers of information and relation extraction. Understanding the context of the documentation procedures and the underlying EHR data is therefore essential when designing an appropriate NLP approach.

