MITRE system for clinical assertion status classification.

Cheryl Clark, Lynette Hirschman, Matt Coarr, Alexander Yeh, John Aberdeen, David Tresner-Kirsch, Ben Wellner

doi:10.1136/amiajnl-2011-000164

Abstract

This paper describes a system for determining the assertion status of medical problems mentioned in clinical reports. The system was entered in the 2010 i2b2/VA community evaluation 'Challenges in natural language processing for clinical data' for the task of classifying assertions associated with problem concepts extracted from patient records. A combination of machine learning (conditional random field and maximum entropy) and rule-based (pattern matching) techniques was used to detect negation, speculation, and hypothetical and conditional information, as well as information associated with persons other than the patient. The best submission obtained an overall micro-averaged F-score of 0.9343. Using semantic attributes of concepts and information about document structure as features for statistical classification of assertions is an effective way to combine rule-based and statistical techniques. In this task, the choice of features may be more important than the choice of classifier algorithm.
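A micro-averaged F-score pools true positives, false positives, and false negatives across all assertion classes before computing precision and recall, so frequent classes weigh more than rare ones. The following is a minimal sketch of that calculation, not the official evaluation script; the function name `micro_f1` and the toy labels are illustrative assumptions.

```python
from collections import Counter


def micro_f1(gold, predicted):
    """Micro-averaged F1 over two parallel lists of class labels.

    Hypothetical helper for illustration; not the i2b2/VA scorer.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1   # correct label for class g
        else:
            fp[p] += 1   # class p was predicted wrongly
            fn[g] += 1   # class g was missed

    # Pool the counts over all classes (the "micro" step).
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    if total_tp == 0:
        return 0.0

    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (made up for illustration only).
gold = ["present", "absent", "possible", "present", "hypothetical"]
pred = ["present", "absent", "present", "present", "hypothetical"]
score = micro_f1(gold, pred)
```

When every instance receives exactly one label, as in this sketch, the pooled precision and recall are equal, so the micro-averaged F-score coincides with plain accuracy.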

Full Text