Annotating German Clinical Documents for De-Identification.

Tobias Kolditz, Boris Betz, Christina Lohr, Luise Modersohn, Johannes Hellrich, Michael Kiehntopf, Udo Hahn

doi:10.3233/shti190212

Journal: Studies in Health Technology and Informatics
Publication Date: Aug 22, 2019
Citations: 9

Affiliation: Friedrich Schiller University Jena, Jena University Hospital

Keywords: Protected Health Information, De-identification

Abstract

We devised annotation guidelines for the de-identification of German clinical documents and assembled a corpus of 1,106 discharge summaries and transfer letters with 44K annotated protected health information (PHI) items. After three rounds of iteration, our annotation team reached an inter-annotator agreement (averaged pairwise F1 score) of 0.96 at the instance level and 0.97 at the token level. To establish a baseline for automatic de-identification on our corpus, we trained a recurrent neural network (RNN) and achieved F1 scores greater than 0.9 on most major PHI categories.
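
The agreement measure named in the abstract, an averaged pairwise F1 score at the instance and token level, can be illustrated with a minimal sketch. The abstract does not specify the matching criteria, so the following assumes exact matching of (start, end, label) spans at the instance level and matching of (token index, label) pairs at the token level. The span representation, the label names, and the toy annotations are hypothetical and are not taken from the corpus.

```python
from itertools import combinations


def f1(a, b):
    """F1 between two annotation sets; symmetric in a and b."""
    if not a and not b:
        return 1.0  # two empty annotation sets agree trivially
    overlap = len(a & b)
    return 2 * overlap / (len(a) + len(b))


def averaged_pairwise_f1(annotation_sets):
    """Mean F1 over all unordered pairs of annotators."""
    pairs = list(combinations(annotation_sets, 2))
    return sum(f1(a, b) for a, b in pairs) / len(pairs)


def to_token_level(spans):
    """Expand (start, end_exclusive, label) spans into (token, label) pairs."""
    return {(i, label) for start, end, label in spans for i in range(start, end)}


# Hypothetical toy annotations from three annotators on one document.
# Spans are (start_token, end_token_exclusive, label); labels are assumed.
ann_a = {(0, 2, "NAME"), (5, 6, "DATE"), (9, 11, "LOCATION")}
ann_b = {(0, 2, "NAME"), (5, 6, "DATE"), (9, 10, "LOCATION")}
ann_c = {(0, 2, "NAME"), (5, 6, "DATE"), (9, 11, "LOCATION")}

annotators = [ann_a, ann_b, ann_c]
instance_f1 = averaged_pairwise_f1(annotators)
token_f1 = averaged_pairwise_f1([to_token_level(s) for s in annotators])

print(f"instance-level: {instance_f1:.3f}")
print(f"token-level:    {token_f1:.3f}")
```

In this toy case one annotator marks a shorter LOCATION span than the other two. Under the exact-match assumption this counts as a full miss at the instance level but only a partial miss at the token level, which is one way a token-level score can come out higher than the instance-level score.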
