Chemical-disease relation (CDR) extraction aims to identify semantic relations between chemical and disease entities in unstructured biomedical documents, providing a basis for downstream tasks such as clinical diagnosis and drug discovery. Compared with general-domain relation extraction, CDR extraction requires a more effective representation of the whole document, covering both biomedical entities and entity pairs, because biomedical text is highly specialized. In this paper, we propose a novel Multi-view Merge Representation (MMR) model that thoroughly captures the entity and entity-pair representations of a document. First, we use prior knowledge and a pre-trained transformer encoder to capture entity semantic representations. Then we employ a U-Net layer and a Graph Convolution Network (GCN) layer to capture global entity-pair representations. Finally, we obtain a merged representation specific to each entity pair, which is then classified. We evaluate our model on the CDR dataset published by the BioCreative-V community and achieve a state-of-the-art result.
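
To make the graph-convolution and merge steps more concrete, the following is a minimal illustrative sketch in plain Python. It is not the MMR implementation: the function names `gcn_layer` and `merge_pair`, the mean-aggregation (row-normalized) form of the convolution, the toy graph, and the use of simple concatenation for merging are all assumptions chosen for illustration.

```python
from typing import List

Matrix = List[List[float]]


def gcn_layer(features: Matrix, adjacency: List[List[int]], weight: Matrix) -> Matrix:
    """One graph convolution step (hypothetical helper).

    Computes ReLU(D^-1 (A + I) X W): each node averages its own features
    with those of its neighbors, then applies a linear projection.
    This row-normalized variant is an assumption, not the paper's layer.
    """
    n = len(features)
    in_dim = len(features[0])
    out_dim = len(weight[0])
    output = []
    for i in range(n):
        # Neighbors of node i, with a self-loop added.
        neighbors = [j for j in range(n) if adjacency[i][j] or j == i]
        # Mean-aggregate neighbor features.
        agg = [sum(features[j][k] for j in neighbors) / len(neighbors)
               for k in range(in_dim)]
        # Linear projection followed by ReLU.
        row = [max(0.0, sum(agg[k] * weight[k][m] for k in range(in_dim)))
               for m in range(out_dim)]
        output.append(row)
    return output


def merge_pair(head: List[float], tail: List[float], context: List[float]) -> List[float]:
    """Merge several views of an entity pair by concatenation (assumed merge rule)."""
    return head + tail + context


# Toy example: three nodes with 2-d features; nodes 0 and 1 are linked.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

hidden = gcn_layer(features, adjacency, identity)
pair_repr = merge_pair(hidden[0], hidden[1], hidden[2])
```

In a real system the merged vector would feed a classifier over relation labels, and the node features would come from the pre-trained encoder rather than hand-written values.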