Abstract

Background: A shareable repository of clinical notes is critical for advancing natural language processing (NLP) research. Many NLP researchers therefore aim to create such a repository with both breadth (notes from multiple institutions) and depth (as much individual data as possible).

Methods: We aimed to assess the degree to which individuals would be willing to contribute their health data to such a repository. A compact e-survey probed willingness to share demographic and clinical data categories. Participants were faculty, staff, and students in two geographically diverse major medical centers (Utah and New York). Such a sample could be expected to respond like a typical potential participant from the general public who is given complete and fully informed consent about the pros and cons of participating in a research study.

Results: A total of 2,140 respondents completed the survey. Of these, 56% were “somewhat” or “definitely” willing to share clinical data with identifiers, while 89% were “somewhat” (17%) or “definitely” (72%) willing to share without identifiers. Results were consistent across gender, age, and education, but there were some differences by geographical region. Individuals were most reluctant (50–74%) to share mental health, substance abuse, and domestic violence data.

Conclusions: We conclude that a substantial fraction of potential patient participants, once educated about risks and benefits, would be willing to donate de-identified clinical data to a shared research repository. A slight majority would even be willing to share without de-identification, suggesting that perceptions about data misuse are not a major concern. Such a repository of clinical notes should be invaluable for clinical NLP research and advancement.

Highlights

  • A shareable repository of clinical notes is critical for advancing natural language processing (NLP) research, and many NLP researchers aim to create one that has breadth as well as depth

  • Reproducibility of NLP methods and comparison of results are the cornerstone of biomedical NLP research, but they require that patient data, including clinical textual notes, be made shareable


Summary

Introduction

A shareable repository of clinical notes is critical for advancing natural language processing (NLP) research, and many NLP researchers aim to create one that has both breadth (notes from multiple institutions) and depth (as much individual data as possible). Reproducibility of NLP methods and comparison of results are the cornerstone of biomedical NLP research, but they require that patient data, including clinical textual notes, be made shareable.

[...] health information (PHI) as possible and seek local Institutional Review Board (IRB) approval to share the narratives under a waiver of patient consent and authorization. The distribution of such a corpus is always managed under a data sharing agreement. In light of that kind of performance, the University of Utah and Columbia University IRBs do not sanction the release of de-identified clinical notes under waivers of patient consent and authorization.

Weng et al. BMC Medical Informatics and Decision Making 2019, 19(Suppl 3):

