Abstract

With the rapid development of deep learning, research on deep-learning-based quantitative structure–property relationships has received widespread attention. A deep learning architecture combining Bidirectional Encoder Representations from Transformers (BERT) with a Feedforward Neural Network (FNN) is proposed to compare the performance of different tokenization algorithms, and t-distributed stochastic neighbor embedding (t-SNE) is used to reveal information about the mechanism underlying the structure–property relationships. Based on the best-performing tokenization algorithm, a second deep learning framework, BERT-Convolutional Neural Network (CNN)-FNN, is developed to predict the σ-profile and VCOSMO. In this framework, the BERT model vectorizes the molecular structures and captures both local and global features of the entire molecule, the CNN model enhances the latent representation associated with the molecular properties, and the FNN model establishes the correlation between that representation and the target properties. These deep learning frameworks predict the σ-profile and VCOSMO with R² values greater than 0.9703, making them promising intelligent tools for guiding solvent design and screening.
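For readers unfamiliar with the reported metric, the following is a minimal sketch of how a coefficient of determination (R²) is commonly computed from reference and predicted values. The function name `r_squared` and the sample numbers are illustrative assumptions, not data or code from this work, and the abstract does not state which exact variant of R² the authors used.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Hypothetical helper for illustration; assumes two equal-length,
    non-empty sequences and non-constant reference values.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the reference values about their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for constant reference values")
    return 1.0 - ss_res / ss_tot


# Made-up numbers, purely to show the call; not results from the paper.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
score = r_squared(reference, predicted)
```

With these made-up inputs the score is about 0.98; a value near 1 means the predictions account for nearly all of the variance in the reference values.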
