Abstract
Background: De-identification is the first step before electronic medical records can be used for data processing or further medical investigation. Consequently, a reliable automated de-identification system would be of high value.
Methods: In this paper, we propose a method that combines a text skeleton with a recurrent neural network to solve the de-identification problem. The text skeleton is the general structure of a medical record, which can help neural networks learn better.
Results: We evaluated our method on three datasets: two English datasets from the i2b2 de-identification challenge and a Chinese dataset that we annotated. Empirical results show that the proposed text-skeleton-based method helps the network recognize protected health information.
Conclusions: The comparison between our method and state-of-the-art frameworks indicates that our method achieves high performance on medical record de-identification.
Highlights
De-identification is the first step before electronic medical records can be used for data processing or further medical investigation
Hyperparameters: dropout: 0.5; recurrent neural network (RNN) architecture: Gated Recurrent Unit (GRU); hidden dimension: 150; embedding dimension: 150; early-stopping epoch: 8; window size: 7; r: 0.25
It is important to evaluate at the binary token level (PHI token versus non-PHI token)
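To make the binary token-level evaluation concrete, here is a minimal Python sketch. It is not the authors' evaluation script. It assumes each token carries a label string and that non-PHI tokens are marked with the tag "O"; the function name binary_token_scores and the example labels are hypothetical. Any PHI label counts as a match for any other PHI label, since only the PHI versus non-PHI distinction is scored.

```python
from typing import Dict, Sequence


def binary_token_scores(
    gold: Sequence[str], pred: Sequence[str], non_phi: str = "O"
) -> Dict[str, float]:
    """Precision, recall and F1 for PHI vs. non-PHI tokens.

    The "O" tag for non-PHI tokens is an assumption of this sketch.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        gold_is_phi = g != non_phi
        pred_is_phi = p != non_phi
        if gold_is_phi and pred_is_phi:
            tp += 1  # PHI token found, regardless of its PHI category
        elif pred_is_phi:
            fp += 1  # non-PHI token wrongly flagged as PHI
        elif gold_is_phi:
            fn += 1  # PHI token missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for a six-token note.
gold_labels = ["O", "NAME", "NAME", "O", "DATE", "O"]
pred_labels = ["O", "NAME", "O", "O", "AGE", "DATE"]
scores = binary_token_scores(gold_labels, pred_labels)
print(scores)
```

In this example the token predicted as AGE but annotated as DATE still counts as a true positive, because both labels are PHI. An entity-level or category-sensitive metric would score it as an error.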
Summary
De-identification is the first step before electronic medical records can be used for data processing or further medical investigation. Electronic Medical Records (EMRs) contain a large amount of information, which makes them valuable resources for study. Because EMRs also contain a large amount of Protected Health Information (PHI), it is difficult for researchers or organizations to obtain these records. Dorr et al. [1] evaluated the time cost of manually de-identifying narrative text notes (87.2 ± 61 s per note). They concluded that the problem of de-identification was