Robust relational parsing over biomedical literature: extracting inhibit relations.

J Pustejovsky, B Cochran, J Zhang, J Castaño, M Kotecki

doi:10.1142/9789812799623_0034

Abstract

We describe the design of a robust parser for identifying and extracting biomolecular relations from the biomedical literature. Separate automata over distinct syntactic domains were developed for extracting nominal-based relational information and verbal-based relations. This allowed us to optimize the grammars for each module separately, independently of any specific relation, resulting in significantly better performance. A unique feature of this system is the use of text-based anaphora resolution to enhance the results of argument binding in relational extraction. We demonstrate the performance of our system on inhibition relations and present our initial results, measured against an annotated text used as a gold standard for evaluation. The results represent a significant improvement over previously published results on extracting such relations from Medline: Precision was 90%, Recall 57%, and Partial Recall 22%. These results demonstrate the effectiveness of a corpus-based linguistic approach to information extraction over Medline.
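
The split between a verbal module and a nominal module can be illustrated with a minimal sketch. This is not the system described in the abstract: simple regular expressions stand in for the automata, the names `ENTITY`, `VERBAL`, `NOMINAL`, and `extract_inhibit` are hypothetical, entities are assumed to be single tokens, and no anaphora resolution is attempted.

```python
import re

# Hypothetical entity pattern: one token such as "p53" or "Mdm2" (an assumption).
ENTITY = r"[A-Za-z][A-Za-z0-9-]*"

# Verbal module: "<agent> inhibit(s|ed) <target>".
VERBAL = re.compile(rf"\b({ENTITY})\s+inhibit(?:s|ed)?\s+({ENTITY})\b")

# Nominal module: "inhibition of <target> by <agent>".
NOMINAL = re.compile(
    rf"\binhibition\s+of\s+({ENTITY})\s+by\s+({ENTITY})\b", re.IGNORECASE
)


def extract_inhibit(sentence):
    """Return (relation, agent, target) triples found by either module."""
    relations = []
    # Verbal pattern: the agent precedes the verb, the target follows it.
    for m in VERBAL.finditer(sentence):
        relations.append(("inhibit", m.group(1), m.group(2)))
    # Nominal pattern: the target comes first, the agent follows "by".
    for m in NOMINAL.finditer(sentence):
        relations.append(("inhibit", m.group(2), m.group(1)))
    return relations


if __name__ == "__main__":
    print(extract_inhibit("Mdm2 inhibits p53"))
    print(extract_inhibit("Inhibition of p53 by Mdm2 was observed."))
```

Because each pattern is kept in its own module, either one can be tuned without touching the other, and both normalize to the same agent-target triple.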

Full Text