Toward an Explainable Large Language Model for the Automatic Identification of the Drug-Induced Liver Injury Literature.

Chunwei Ma, Russell D Wolfinger

doi:10.1021/acs.chemrestox.4c00134

Abstract

Drug-induced liver injury (DILI) stands as a significant concern in drug safety, representing the primary cause of acute liver failure. Identifying the scientific literature related to DILI is crucial for monitoring, investigating, and conducting meta-analyses of drug safety issues. Given the intricate and often obscure nature of drug interactions, simple keyword searching can be insufficient for the exhaustive retrieval of the DILI-relevant literature. Manual curation of DILI-related publications demands pharmaceutical expertise and is susceptible to errors, severely limiting throughput. Despite numerous efforts utilizing cutting-edge natural language processing and deep learning techniques to automatically identify the DILI-related literature, their performance remains suboptimal for real-world applications in clinical research and regulatory contexts. In the past year, large language models (LLMs) such as ChatGPT and its open-source counterpart LLaMA have achieved groundbreaking progress in natural language understanding and question answering, paving the way for the automated, high-throughput identification of the DILI-related literature and subsequent analysis. Leveraging a large-scale public dataset comprising 14 203 training publications from the CAMDA 2022 literature AI challenge, we have developed what we believe to be the first LLM specialized in DILI analysis based on LLaMA-2. In comparison with other smaller language models such as BERT, GPT, and their variants, LLaMA-2 exhibits an enhanced out-of-fold accuracy of 97.19% and area under the ROC curve of 0.9947 using 3-fold cross-validation on the training set. Despite LLMs' initial design for dialogue systems, our study illustrates their successful adaptation into accurate classifiers for automated identification of the DILI-related literature from vast collections of documents. 
This work is a step toward unleashing the potential of LLMs in the context of regulatory science and facilitating the regulatory review process.
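The out-of-fold evaluation mentioned in the abstract (3-fold cross-validation, accuracy, and area under the ROC curve) can be illustrated with a minimal sketch. This is not the authors' pipeline: a toy keyword scorer stands in for the fine-tuned LLaMA-2 classifier, the helper names (`kfold_indices`, `out_of_fold_scores`, `roc_auc`, `fit`, `predict`) are hypothetical, and the six example titles and labels are invented purely for illustration.

```python
import random
from collections import Counter

def kfold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def roc_auc(labels, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def out_of_fold_scores(texts, labels, fit, predict, k=3, seed=0):
    """Score every document with a model that never saw it during training."""
    oof = [None] * len(texts)
    for held_out in kfold_indices(len(texts), k, seed):
        held = set(held_out)
        train = [i for i in range(len(texts)) if i not in held]
        model = fit([texts[i] for i in train], [labels[i] for i in train])
        for i in held_out:
            oof[i] = predict(model, texts[i])
    return oof

# Toy stand-in for the classifier (assumption: NOT the paper's model).
def fit(texts, labels):
    """Keep the words seen more often in positive than in negative training texts."""
    pos_words, neg_words = Counter(), Counter()
    for t, y in zip(texts, labels):
        (pos_words if y == 1 else neg_words).update(t.lower().split())
    return {w for w in pos_words if pos_words[w] > neg_words[w]}

def predict(model, text):
    """Score a text by the fraction of its words found in the positive-word set."""
    words = text.lower().split()
    return sum(w in model for w in words) / len(words)

# Invented example titles; 1 = DILI-related, 0 = unrelated.
texts = [
    "acute liver injury after drug exposure",
    "hepatotoxicity and liver failure from drug overdose",
    "drug induced liver injury case report",
    "kidney function in diabetic patients",
    "cardiac outcomes after surgery",
    "sleep quality in older patients",
]
labels = [1, 1, 1, 0, 0, 0]

oof = out_of_fold_scores(texts, labels, fit, predict, k=3)
accuracy = sum((s >= 0.5) == bool(y) for s, y in zip(oof, labels)) / len(labels)
auc = roc_auc(labels, oof)
print(f"out-of-fold accuracy={accuracy:.2f}, AUC={auc:.2f}")
```

Because each document is scored only by a model trained on the other folds, the pooled scores give a single accuracy and AUC over the whole training set, which is the sense in which the reported figures are "out-of-fold".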

Published in: Chemical Research in Toxicology
