Abstract
In optimizing organic chemical synthesis, researchers often struggle to efficiently generate viable synthesis procedures that conserve time and resources in the laboratory. This paper systematically analyzes multiple approaches to generating synthesis procedures for a wide variety of organic reactions, with the aim of reducing time and resource consumption in laboratory work. We investigated the suitability of different sizes of BART, T5, FLAN-T5, molT5, and classic sequence-to-sequence transformer models for this text-to-text task, using a large dataset prepared specifically for it. Experiments showed that a fine-tuned molT5-large model achieves a BLEU score of 47.75. The results demonstrate the capability of LLMs to predict chemical synthesis procedures comprising 24 distinct possible actions, many of which take parameters such as solvents, reaction agents, temperature, duration, solvent ratios, and other reaction-specific parameters. Our findings show that when only the core reactants are given as input, the models learn to correctly predict which ancillary components need to be included in the resulting procedure. These results are valuable for AI researchers and chemists alike, suggesting that curated datasets and large-language-model fine-tuning techniques can be tailored to specific reaction classes and practical applications. This research contributes to the field by demonstrating how deep-learning-based methods can be customized to the specific requirements of chemical synthesis, leading to more intelligent and resource-efficient laboratory processes.
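The BLEU metric cited above scores a generated procedure against a reference by comparing overlapping n-grams. As a minimal illustration (the paper's exact tokenization, smoothing, and corpus-level aggregation are not specified here, and the action strings below are hypothetical), sentence-level BLEU can be sketched as:

```python
from collections import Counter
import math

def ngrams(tokens, n):
    # Multiset of all contiguous n-grams in the token list.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: geometric mean of modified n-gram precisions
    (n = 1..max_n) multiplied by a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    log_prec_sum = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = max(sum(cand_ngrams.values()), 1)
        if clipped == 0:
            return 0.0  # unsmoothed: any zero precision gives BLEU = 0
        log_prec_sum += math.log(clipped / total)
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_prec_sum / max_n)

# Hypothetical predicted vs. reference action sequences:
perfect = bleu("ADD water ; STIR 2 h at 25 C", "ADD water ; STIR 2 h at 25 C")
partial = bleu("ADD water ; STIR 2 h at 25 C", "ADD ethanol ; STIR 4 h at 25 C")
```

An exact match scores 1.0, while procedures that diverge in solvents or durations lose credit on the mismatched n-grams; published scores such as 47.75 correspond to this value scaled by 100.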