SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer

Zhanpeng Xu, Honglin Li, Jianhua Li, Shiliang Li, Zhaopeng Yang

doi:10.1186/s13321-022-00624-5

Open Access

https://doi.org/10.1186/s13321-022-00624-5

Abstract

Optical chemical structure recognition from scientific publications is essential for rediscovering a chemical structure. It is an extremely challenging problem, and current rule-based and deep-learning methods cannot achieve satisfactory recognition rates. Herein, we propose SwinOCSR, an end-to-end model based on a Swin Transformer. This model uses the Swin Transformer as the backbone to extract image features and introduces Transformer models to convert chemical information from publications into DeepSMILES. A novel chemical structure dataset was constructed to train and verify our method. Our proposed Swin Transformer-based model was extensively tested against the backbone of existing publicly available deep learning methods. The experimental results show that our model significantly outperforms the compared methods, demonstrating the model’s effectiveness. Moreover, we used a focal loss to address the token imbalance problem in the text representation of the chemical structure diagram, and our model achieved an accuracy of 98.58%.
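The focal loss mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard formulation FL(p_t) = -(1 - p_t)^gamma * log(p_t) for a single token prediction. The function names, the example logits, and the value gamma = 2.0 are assumptions chosen for illustration; the abstract does not state the hyperparameters or the exact variant used in SwinOCSR.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for one token: -log(p_t)."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one token: -(1 - p_t)^gamma * log(p_t).

    gamma=2.0 is an illustrative default, not a value taken from the paper.
    With gamma=0 this reduces to plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Hypothetical logits over a 3-token vocabulary; the target is token 0.
easy = [5.0, 0.0, 0.0]   # model is already confident in the target
hard = [0.0, 1.0, 0.0]   # model prefers a different token

for name, logits in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(logits, 0)
    fl = focal_loss(logits, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

The modulating factor (1 - p_t)^gamma shrinks the loss for tokens the model already predicts confidently, which are typically the frequent ones, so rarer tokens contribute relatively more to the gradient.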

Journal: Journal of Cheminformatics	Publication Date: Jul 1, 2022
Citations: 21	License type: open-access

Similar Papers

DECIMER.ai: an open platform for automated optical chemical structure identification, segmentation and recognition in scientific publications
Kohulan Rajan ... Christoph Steinbeck
Nature Communications | VOL. 14
19 Aug 2023

MPOCSR: optical chemical structure recognition based on multi-path Vision Transformer
Fan Lin ... Jianhua Li
Complex & Intelligent Systems | VOL. -
22 Jul 2024

DECIMER-Segmentation: Automated extraction of chemical structure depictions from scientific literature
Kohulan Rajan ... Achim Zielesny
Journal of Cheminformatics | VOL. 13
08 Mar 2021

Advancements in hand-drawn chemical structure recognition through an enhanced DECIMER architecture
Kohulan Rajan ... Christoph Steinbeck
Journal of Cheminformatics | VOL. 16
05 Jul 2024
