Impact Analysis of De-Identification in Clinical Notes Classification.

Martin Baumgartner, Günter Schreier, Lukas Haider, Karl Kreiner, Gerhard Pölzl, Fabian Wiesmüller, Luca Brunelli, Dieter Hayn

doi:10.3233/shti220368

Abstract

Clinical notes provide valuable data in telemonitoring systems for disease management. Such data must be converted into structured information to be effective in automated analysis. One way to achieve this is classification (e.g. into categories). However, to comply with privacy regulations and address privacy concerns, text is usually de-identified. This study investigated the effects of de-identification on classification. Two pseudonymisation and two classification algorithms were applied to clinical messages from a telehealth system, and the divergence from clear-text classification was measured. Overall, de-identification notably altered classification. The more delicate classification algorithm was severely impacted, with losses of sensitivity especially noticeable. However, the simpler classification method was more robust; combined with a more yielding pseudonymisation technique, de-identification had only a negligible impact on its classification results. The results indicate that de-identification can impact text classification and suggest that considering de-identification during development of the classification methods could be beneficial.
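The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the study's method: the keyword lists, the regex-based `pseudonymise` function, the example messages, and the definition of sensitivity relative to clear-text labels are all hypothetical. It only shows how a stricter masking rule can remove words a simple classifier depends on.

```python
import re

# Hypothetical keyword lists; the study's actual categories are not stated here.
KEYWORDS = {
    "medication": {"mg", "dose", "tablet"},
    "appointment": {"appointment", "visit"},
}

def classify(text):
    """Return the set of categories whose keywords occur in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {cat for cat, words in KEYWORDS.items() if tokens & words}

def pseudonymise(text, aggressive=False):
    """Toy de-identification: mask dates; optionally mask every capitalised word."""
    text = re.sub(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b", "<DATE>", text)
    if aggressive:
        text = re.sub(r"\b[A-Z][a-z]+\b", "<NAME>", text)
    return text

def sensitivity(reference, predicted, category):
    """Share of messages labelled `category` on clear text that keep the label."""
    positives = [i for i, labels in enumerate(reference) if category in labels]
    if not positives:
        return None
    kept = sum(1 for i in positives if category in predicted[i])
    return kept / len(positives)

# Invented example messages, not data from the study.
messages = [
    "Dose raised by Dr. Huber on 12.03.2021",
    "The appointment was moved to 15.03.2021",
    "Patient Maria took her tablet in the morning",
]

clear = [classify(m) for m in messages]
mild = [classify(pseudonymise(m)) for m in messages]
strict = [classify(pseudonymise(m, aggressive=True)) for m in messages]

print(sensitivity(clear, mild, "medication"))    # 1.0
print(sensitivity(clear, strict, "medication"))  # 0.5
```

In this toy setup the stricter rule masks the sentence-initial keyword "Dose" as if it were a name, so one of the two medication messages loses its label, while date-only masking leaves every label unchanged.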

Journal: Studies in health technology and informatics
Publication Date: May 16, 2022
Citations: 2
License type: CC BY-NC 4.0

Similar Papers

Identifying Diabetes in Clinical Notes in Hebrew: A Novel Text Classification Approach Based on Word Embedding.
Maxim Topaz ... Nadav Furie
Studies in health technology and informatics | VOL. 264
21 Aug 2019

Designing Explainable Text Classification Pipelines: Insights from IT Ticket Complexity Prediction Case Study
Aleksandra Revina ... Krisztian Buza
01 Jan 2020

Text classification method based on self-training and LDA topic models
Miha Pavlinek ... Vili Podgorelec
Expert Systems with Applications | VOL. 80
08 Mar 2017

A unified view on association and classification mining and its applications
Dimitrios Meretakis
23 Dec 2014
