Knowledge graphs remain underdeveloped in the Chinese medical domain, a gap this study aims to address. Building on the BERT pre-trained language model, this study proposes a two-tier approach that uses separate pre-trained encoders for an entity model and a relation model. The two models are pipelined: the output of the entity model feeds into the relation model, enabling the extraction of entity relations from Chinese medical texts. The approach is evaluated on the CMeIE dataset, released at the CHIP (China Health Information Processing) conference and widely used as a benchmark for Chinese medical text processing. Experimental results demonstrate the effectiveness of the approach in extracting relations from Chinese medical literature. Beyond enriching resources for the Chinese medical domain, the method has potential applications ranging from the construction of Chinese medical knowledge graphs to support for early-stage medical diagnosis, and it lays groundwork for further advances in medical NLP.
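To make the pipeline concrete, the following is a minimal sketch of a two-encoder design in the spirit described above. The class names, the entity-marker scheme, and the type counts are illustrative assumptions, not the paper's exact architecture; the sketch only shows how an entity model's predictions can be marked into the text and re-encoded by a separate relation encoder.

```python
# Minimal sketch of a two-encoder entity/relation pipeline (PyTorch + HuggingFace
# transformers). Names, marker scheme, and label counts are illustrative assumptions.
import torch
from transformers import BertTokenizerFast, BertModel

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")

class EntityModel(torch.nn.Module):
    """Token-level entity tagger built on its own BERT encoder."""
    def __init__(self, num_entity_types: int):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-chinese")
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size, num_entity_types)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        return self.classifier(hidden)  # per-token entity logits

class RelationModel(torch.nn.Module):
    """Relation classifier with a separate BERT encoder; it consumes text in
    which the entity model's predicted spans have already been marked."""
    def __init__(self, num_relations: int):
        super().__init__()
        self.encoder = BertModel.from_pretrained("bert-base-chinese")
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size, num_relations)

    def forward(self, input_ids, attention_mask):
        pooled = self.encoder(input_ids=input_ids, attention_mask=attention_mask).pooler_output
        return self.classifier(pooled)  # one relation label per marked entity pair

def mark_entities(text: str, head: str, tail: str) -> str:
    """Insert plain-text markers around a predicted entity pair so the relation
    encoder can attend to them (the marker scheme here is an assumption)."""
    return text.replace(head, f"<e1>{head}</e1>", 1).replace(tail, f"<e2>{tail}</e2>", 1)

# Usage: the entity model first proposes spans; each candidate pair is then
# re-encoded with markers and classified by the relation model.
text = "糖尿病可并发视网膜病变"  # "Diabetes can be complicated by retinopathy"
enc = tokenizer(text, return_tensors="pt")
entity_logits = EntityModel(num_entity_types=9)(enc["input_ids"], enc["attention_mask"])

marked = mark_entities(text, "糖尿病", "视网膜病变")
enc_marked = tokenizer(marked, return_tensors="pt")
relation_logits = RelationModel(num_relations=44)(  # set to the CMeIE schema size
    enc_marked["input_ids"], enc_marked["attention_mask"]
)
```

Keeping the two encoders separate lets each be fine-tuned for its own sub-task, at the cost of encoding each sentence twice; the marker-based hand-off is one common way to pass the entity model's output into the relation model.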