Abstract

Automatic extraction of clinical named entities, such as body parts, drugs, and surgeries, is of great significance for understanding clinical texts. Deep neural network approaches have recently achieved remarkable success in named entity recognition tasks. However, most of these approaches train models on large, high-quality labeled datasets that are labor-intensive to produce. To reduce labeling costs, we propose a weakly supervised learning method for the clinical named entity recognition (CNER) task. We use a small amount of labeled data as a seed corpus and propose a bootstrapping method that integrates external knowledge to iteratively generate labels for unlabeled data. The external knowledge consists of domain-specific dictionaries and a set of handcrafted rules. We conduct experiments on the CCKS-2018 CNER task dataset, and our approach achieves results competitive with a supervised approach trained on fully labeled data.
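
To make the bootstrapping idea summarized above concrete, the following is a minimal sketch in Python of one way such a loop could be organized: a seed corpus trains a model, dictionary matches and handcrafted rules supply weak labels for unlabeled sentences, and the merged labels are fed back as training data. All names here (dictionary_label, bootstrap, train_model, the rule callables) are illustrative assumptions, not the paper's actual implementation.

    def dictionary_label(sentence, dictionaries):
        """Weakly label a sentence with character-level BIO tags by matching dictionary terms."""
        tags = ["O"] * len(sentence)
        for entity_type, terms in dictionaries.items():
            for term in terms:
                start = sentence.find(term)
                while start != -1:
                    tags[start] = f"B-{entity_type}"
                    for i in range(start + 1, start + len(term)):
                        tags[i] = f"I-{entity_type}"
                    start = sentence.find(term, start + len(term))
        return tags

    def bootstrap(seed_corpus, unlabeled, dictionaries, rules, train_model, rounds=3):
        """Iteratively generate weak labels for unlabeled data and grow the training set."""
        train_set = list(seed_corpus)                    # small labeled seed corpus
        for _ in range(rounds):
            model = train_model(train_set)               # caller-supplied supervised NER trainer
            new_examples = []
            for sentence in unlabeled:
                predicted = model(sentence)              # model's BIO tag sequence
                weak = dictionary_label(sentence, dictionaries)
                # Merge: keep dictionary evidence where it fires, otherwise trust the model.
                merged = [w if w != "O" else p for p, w in zip(predicted, weak)]
                for rule in rules:                       # handcrafted post-processing rules
                    merged = rule(sentence, merged)
                new_examples.append((sentence, merged))
            train_set = list(seed_corpus) + new_examples # regenerate weak labels each round
        return train_set

In this sketch the merging policy (dictionary and rules override the model) and the fixed number of rounds are design choices made for illustration; the paper's actual label-generation and stopping criteria may differ.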
