Abstract

The massive quantity of clinical text in electronic medical records (EMRs) has created significant demand for clinical text processing and information extraction in health care and medical research. Detailed clinical observations of patients are typically recorded chronologically. Temporal information in such clinical texts consists of three elements: temporal expressions, temporal events, and temporal relations. Because temporal information in clinical text is often expressed implicitly, the writing quality is poor, and the language is domain-specific, extracting temporal information is much more complex than for newswire texts. Despite these difficulties, a few research works have reported rule-based, machine-learning, and hybrid methods that extract temporal information using annotated corpora. However, creating annotated corpora is expensive, time-consuming, and demands significant human effort, and processing quality is inevitably affected by the small size of the resulting corpora. Motivated by this issue, we present a novel method to effectively extract temporal information from EMR clinical texts. The essential idea is first to build a feature set suited to clinical expressions, then to develop a semi-supervised framework for temporal event extraction, and finally to detect temporal relations among events using a newly formulated hypothesis. Comparative experimental evaluation on the I2B2 data set shows improved performance of the proposed methods. Specifically, temporal event and relation extraction achieve F-measures of 89.98% and 67.1%, respectively.

