De-identification of psychiatric intake records: Overview of 2016 CEGS N-GRID shared tasks Track 1

Amber Stubbs, Michele Filannino, Özlem Uzuner

doi:10.1016/j.jbi.2017.06.011

Abstract

The 2016 CEGS N-GRID shared tasks for clinical records contained three tracks. Track 1 focused on de-identification of a new corpus of 1000 psychiatric intake records. This track tackled de-identification in two sub-tracks. Track 1.A was a “sight unseen” task, in which nine teams ran existing de-identification systems, without any modifications or training, on 600 new records to gauge how well systems generalize to new data. The best-performing system for this track scored an F1 of 0.799. Track 1.B was a traditional Natural Language Processing (NLP) shared task on de-identification, in which 15 teams had two months to train their systems on the new data and then test them on an unannotated test set. The best-performing system from this track scored an F1 of 0.914. The scores for Track 1.A show that unmodified existing systems do not generalize well to new data without the benefit of training data. The scores for Track 1.B are slightly lower than those of the 2014 de-identification shared task (which was almost identical to 2016 Track 1.B), indicating that these new psychiatric records pose a more difficult challenge to NLP systems. Overall, de-identification is still not a solved problem, though it is important to the future of clinical NLP.
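The F1 scores quoted above are the harmonic mean of precision and recall. As a rough illustration only, the sketch below computes a strict, entity-level F1 over sets of annotated spans. The span representation (record ID, start offset, end offset, category), the exact-match criterion, and the toy data are all assumptions made for this example; the abstract does not describe the shared task's official evaluation procedure.

```python
from typing import Set, Tuple

# Hypothetical span format: (record_id, start_offset, end_offset, category).
Span = Tuple[str, int, int, str]


def f1_score(gold: Set[Span], predicted: Set[Span]) -> float:
    """Strict entity-level F1: a prediction counts only if it matches a gold span exactly."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration: three of four gold spans are found,
# and one predicted span is spurious.
gold = {
    ("rec1", 0, 10, "NAME"),
    ("rec1", 25, 35, "DATE"),
    ("rec2", 4, 12, "CITY"),
    ("rec2", 40, 52, "PHONE"),
}
predicted = {
    ("rec1", 0, 10, "NAME"),
    ("rec1", 25, 35, "DATE"),
    ("rec2", 4, 12, "CITY"),
    ("rec2", 60, 70, "NAME"),
}

print(f1_score(gold, predicted))  # precision 0.75, recall 0.75, so F1 is 0.75
```

Under this strict criterion a system is penalized both for identifiers it misses (lower recall) and for spans it flags incorrectly (lower precision), which is why a single F1 figure is a convenient summary when comparing systems.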

Journal: Journal of Biomedical Informatics	Publication Date: Jun 11, 2017
Citations: 89	License type: publisher-specific-oa
