Foundational model aided automatic high-throughput drug screening using self-controlled cohort study.

Shenbo Xu, Raluca Cobzaru, Stan N Finkelstein, Roy E Welsch, Kenney Ng, Lefkos Middleton

doi:10.1101/2024.08.04.24311480

Abstract

Developing a medicine from scratch through to governmental authorization, and detecting adverse drug reactions (ADRs), have rarely been economical, fast, or low-risk undertakings. The availability of large-scale observational healthcare databases and the popularity of large language models offer an unparalleled opportunity to enable automatic high-throughput drug screening for both repurposing and pharmacovigilance. We aim to demonstrate a general workflow for automatic high-throughput drug screening with the following advantages: (i) associations between many exposures and diseases can be estimated; (ii) repurposing and pharmacovigilance are integrated; (iii) an accurate exposure length for each prescription is parsed from clinical text; (iv) intrinsic relationships between drugs and diseases are removed jointly by bioinformatic mapping and a large language model (ChatGPT); (v) causal interpretations of incidence rate contrasts are provided. Using a self-controlled cohort study design, in which subjects serve as their own control group, we tested the intention-to-treat association of medications with the incidence of diseases. The exposure length for each prescription is determined by parsing common dosage instructions in English free text into a structured format. The exposure period runs from initial prescription to treatment discontinuation. A period of the same length immediately preceding initial treatment serves as the control period. Clinical outcomes and categories are identified using existing phenotyping algorithms. Incidence rate ratios (IRRs) are tested using uniformly most powerful (UMP) unbiased tests. We assessed 3,444 medications against 276 diseases in 6,613,198 patients from the Clinical Practice Research Datalink (CPRD), a UK primary care electronic health record (EHR) database spanning 1987 to 2018.
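The exposure and control windows described above can be illustrated with a minimal Python sketch. It is not the study's parser: the lookup tables, the regular expression, and the function names `parse_daily_units`, `exposure_days`, and `study_windows` are hypothetical, and the sketch handles a single prescription with a known dispensed quantity, whereas the study defines exposure from initial prescription to treatment discontinuation.

```python
import math
import re
from datetime import date, timedelta

# Hypothetical lookup tables; a real parser would need far broader coverage.
WORD_TO_NUM = {"one": 1, "two": 2, "three": 3, "four": 4}
FREQ_PER_DAY = {
    "once daily": 1, "once a day": 1,
    "twice daily": 2, "twice a day": 2,
    "three times daily": 3, "three times a day": 3,
    "four times daily": 4, "four times a day": 4,
}
# A dose is a number (digits or a small number word) followed by a unit noun.
DOSE_RE = re.compile(r"\b(\d+|one|two|three|four)\s+(?:tablet|capsule)s?\b")


def parse_daily_units(text):
    """Return units taken per day, or None if the instruction is not understood."""
    t = text.lower()
    dose = DOSE_RE.search(t)
    if dose is None:
        return None
    token = dose.group(1)
    units = int(token) if token.isdigit() else WORD_TO_NUM[token]
    for phrase, per_day in FREQ_PER_DAY.items():
        if phrase in t:
            return units * per_day
    return None


def exposure_days(quantity, text):
    """Days covered by one prescription: dispensed quantity / units per day, rounded up."""
    daily = parse_daily_units(text)
    if not daily:
        return None
    return math.ceil(quantity / daily)


def study_windows(first_prescription, days):
    """Return (control, exposure) windows of equal length around the first prescription."""
    length = timedelta(days=days)
    control = (first_prescription - length, first_prescription)
    exposure = (first_prescription, first_prescription + length)
    return control, exposure


days = exposure_days(56, "Take 2 tablets twice daily")
control, exposure = study_windows(date(2010, 1, 15), days)
```

Because the two windows have equal length, each subject contributes the same person-time to the control and exposure periods, which is what makes the within-person comparison of event counts straightforward.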
Because of the built-in selection bias of self-controlled cohort studies, ingredient-disease pairs confounded by deterministic medical relationships are removed using existing mappings from RxNorm and, where no mapping exists, by querying ChatGPT. A total of 16,901 drug-disease pairs showed a significant risk reduction and can be considered candidates for repurposing, while 11,089 pairs showed a significant risk increase, for which drug safety may instead be a concern. This work developed a data-driven, nonparametric, hypothesis-generating, automatic high-throughput workflow that reveals the potential of natural language processing in pharmacoepidemiology. We apply the paradigm to a large observational health dataset to help discover potential novel therapies and adverse drug effects. The framework of this study can be extended to other observational medical databases.
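As an illustration of how a single drug-disease pair might be tested, the sketch below assumes event counts follow Poisson distributions and that the control and exposure periods contribute equal person-time, as in the design described above. Conditional on the total number of events, the exposed count is then binomial with success probability 0.5 under the null hypothesis IRR = 1. The strict UMP unbiased test for discrete data is randomized; the non-randomized exact conditional test shown here is a common stand-in and is not necessarily the authors' implementation. The function name `irr_conditional_test` is hypothetical.

```python
from math import comb


def irr_conditional_test(events_exposed, events_control):
    """Return (IRR estimate, two-sided exact p-value) assuming equal person-time.

    Under H0 (IRR = 1), events_exposed ~ Binomial(n, 0.5) given n total events.
    """
    n = events_exposed + events_control
    if n == 0:
        # No events in either window: nothing to estimate or test.
        return float("nan"), 1.0
    irr = events_exposed / events_control if events_control > 0 else float("inf")

    def cdf(k):
        # P(X <= k) for X ~ Binomial(n, 0.5), computed with exact integers.
        return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

    lower = cdf(events_exposed)                                   # P(X <= x)
    upper = 1.0 - cdf(events_exposed - 1) if events_exposed > 0 else 1.0  # P(X >= x)
    p_value = min(1.0, 2.0 * min(lower, upper))
    return irr, p_value


# Example: 5 events during exposure versus 15 during the control period.
irr, p_value = irr_conditional_test(5, 15)
```

An IRR below 1 with a small p-value would flag the pair as a repurposing candidate, and an IRR above 1 as a potential safety signal. In a screen over millions of pairs, a multiple-testing correction would also be needed before calling any result significant.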

Journal: medRxiv : the preprint server for health sciences
Publication Date: Sep 16, 2024
License type: CC BY-NC 4.0
