Abstract
Cohort selection is an essential prerequisite for clinical research, determining whether an individual satisfies given selection criteria. Previous works for cohort selection usually treated each selection criterion independently and ignored not only the meaning of each selection criterion but the relations among cohort selection criteria. To solve the problems above, we propose a novel unified machine reading comprehension (MRC) framework. In this MRC framework, we design simple rules to generate questions for each criterion from cohort selection guidelines and treat clues extracted by trigger words from patients' medical records as passages. A series of state-of-the-art MRC models based on BiDAF, BIMPM, BERT, BioBERT, NCBI-BERT, and RoBERTa are deployed to determine which question and passage pairs match. We also introduce a cross-criterion attention mechanism on representations of question and passage pairs to model relations among cohort selection criteria. Results on two datasets, that is, the dataset of the 2018 National NLP Clinical Challenge (N2C2) for cohort selection and a dataset from the MIMIC-III dataset, show that our NCBI-BERT MRC model with cross-criterion attention mechanism achieves the highest micro-averaged F1-score of 0.9070 on the N2C2 dataset and 0.8353 on the MIMIC-III dataset. It is competitive to the best system that relies on a large number of rules defined by medical experts on the N2C2 dataset. Comparing these two models, we find that the NCBI-BERT MRC model mainly performs worse on mathematical logic criteria. When using rules instead of the NCBI-BERT MRC model on some criteria regarding mathematical logic on the N2C2 dataset, we obtain a new benchmark with an F1-score of 0.9163, indicating that it is easy to integrate rules into MRC models for improvement.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.