Abstract

The number of biomedical publications is growing at an accelerating pace. This ever-increasing volume of scientific literature has made it impossible to keep up with all published articles, even in a very specific research area. Yet a solid grasp of the existing literature is essential for coming up with novel and plausible scientific ideas. To bridge the gap between published scientific findings and our inability to process them manually, we need to convert unstructured text into a structured, machine-readable form that automated methods can use to generate novel hypotheses, which can then be validated manually. A plausible approach is to use named entity recognition (NER) and relation extraction (RE) methods to identify biological entities and extract the relations among them, and then to assemble these into knowledge graphs (KGs). KGs link concepts across existing research, allowing researchers to find connections that would otherwise be difficult to discover. The LitCoin Natural Language Processing (NLP) Challenge was recently organized by NCATS of NIH and NASA to spur innovation by rewarding the most creative and high-impact uses of free text from biomedical publications to create KGs. Our team participated in the challenge and placed first. Using the pipelines developed for the LitCoin NLP Challenge, we have constructed the largest-scale biomedical KG using all PubMed articles. We further developed advanced deep-learning methods to predict new links from the constructed KG. We demonstrate the power of this new framework using several examples important for drug discovery.

Citation Format: Yuan Zhang, Feng Pan, Xin Sui, Donghu Sun, Menghan Chung, Jinfeng Zhang. Constructing the largest-scale biomedical knowledge graph using all PubMed articles and its application in automated knowledge discovery. [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 5366.
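
As a rough illustration of the approach described in the abstract, the sketch below shows how (head, relation, tail) triples, such as those an NER and RE pipeline might emit, can be indexed as a small KG and then searched for indirect connections. It is not the authors' pipeline: the triples are invented placeholders, the function names `build_graph` and `two_hop_candidates` are hypothetical, and the two-hop path heuristic is a deliberately simple stand-in for the deep-learning link prediction mentioned in the abstract.

```python
from collections import defaultdict

# Toy (head, relation, tail) triples standing in for NER + RE output.
# Entity and relation names are placeholders, not findings from the abstract.
TRIPLES = [
    ("drug_A", "inhibits", "gene_B"),
    ("gene_B", "associated_with", "disease_C"),
    ("drug_A", "treats", "disease_D"),
    ("gene_E", "associated_with", "disease_C"),
]


def build_graph(triples):
    """Index triples as head -> list of (relation, tail) edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return graph


def two_hop_candidates(graph, source):
    """Return (target, path) pairs reachable in exactly two hops from source,
    skipping targets that are already direct neighbours of source."""
    direct = {tail for _, tail in graph.get(source, [])}
    candidates = []
    for rel1, mid in graph.get(source, []):
        for rel2, target in graph.get(mid, []):
            if target != source and target not in direct:
                path = [(source, rel1, mid), (mid, rel2, target)]
                candidates.append((target, path))
    return candidates


graph = build_graph(TRIPLES)
for target, path in two_hop_candidates(graph, "drug_A"):
    print("candidate link: drug_A ->", target, "via", path)
```

In this toy graph, `drug_A` has no direct edge to `disease_C`, but the path through `gene_B` surfaces it as a candidate link that would still need manual validation.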