Abstract
Biomedical relation extraction (RE) is essential for systematic biomedical research. It supports knowledge mining, such as knowledge graph (KG) construction and novel knowledge discovery, and promotes translational applications such as clinical diagnosis, decision making, and precision medicine. However, the relations between biomedical entities are complex and diverse, and comprehensive biomedical RE is not yet well established. This paper investigates and improves large-scale RE with diverse relation types, and conducts usability studies on application scenarios to optimize biomedical text mining. Datasets containing 125 relation types with different entity semantic levels were constructed to evaluate the impact of entity semantic information on RE, and performance was analyzed across different model architectures and domain models. This study also proposed a continued pre-training strategy and integrated the models and scripts into tools. Furthermore, RE was applied to a COVID-19 corpus, using article topics and application scenarios of clinical interest, to assess and demonstrate its biological interpretability and usability. The performance analysis revealed that RE performs best when the detailed semantic type is provided. Among single models, PubMedBERT with our continued pre-training performed best, with an F1 score of 0.8998. The usability studies on COVID-19 demonstrated the interpretability and usability of RE; a relation graph database was constructed and used to reveal existing and novel drug paths with edge explanations. The models (pre-trained and fine-tuned), the integrated tool (Docker), and the generated data (including the COVID-19 relation graph database and drug paths) are publicly available to the biomedical text mining community and clinical researchers. This study provided a comprehensive analysis of RE with diverse relation types.
Optimized RE models and tools for diverse relation types were developed and can be widely used in biomedical text mining. Our usability studies provide a proof-of-concept demonstration of how large-scale RE can be leveraged to facilitate novel research.
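To make the idea of "drug paths with edge explanations" concrete, the following is a minimal sketch of how extracted relations might be stored as a directed graph and searched for paths between two entities. It is not the study's implementation: the entity names, relation labels, evidence identifiers, and the functions `build_graph` and `find_paths` are all hypothetical placeholders, and a plain in-memory dictionary stands in for whatever graph database the authors used.

```python
from collections import deque

# Hypothetical toy triples: (head, relation, tail, evidence id).
# Placeholder names only; none of this is data from the study.
TRIPLES = [
    ("drug_A", "inhibits", "protein_X", "sent_1"),
    ("protein_X", "associated_with", "disease_Y", "sent_2"),
    ("drug_A", "interacts_with", "drug_B", "sent_3"),
    ("drug_B", "treats", "disease_Y", "sent_4"),
    ("drug_B", "inhibits", "protein_Z", "sent_5"),
]


def build_graph(triples):
    """Index triples by head entity so outgoing edges can be looked up quickly."""
    graph = {}
    for head, relation, tail, evidence in triples:
        graph.setdefault(head, []).append((relation, tail, evidence))
    return graph


def find_paths(graph, source, target, max_hops=3):
    """Breadth-first enumeration of simple paths with at most max_hops edges.

    Each path is a list of (head, relation, tail, evidence) edges, so every
    hop carries the relation label and the evidence that explains it.
    """
    paths = []
    queue = deque([(source, [])])
    while queue:
        node, edges = queue.popleft()
        if node == target and edges:
            paths.append(edges)
            continue
        if len(edges) >= max_hops:
            continue
        # Entities already on this path; skipping them keeps paths simple (no cycles).
        visited = {source} | {edge[2] for edge in edges}
        for relation, nxt, evidence in graph.get(node, []):
            if nxt not in visited:
                queue.append((nxt, edges + [(node, relation, nxt, evidence)]))
    return paths


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    for path in find_paths(graph, "drug_A", "disease_Y"):
        print(" -> ".join(f"{h} [{r}, {ev}] {t}" for h, r, t, ev in path))
```

Because the search is breadth-first, shorter paths are returned before longer ones, and the `max_hops` limit keeps the enumeration bounded on dense graphs.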