Abstract

Quantitative structure–property relationship (QSPR) modeling is an approach for estimating molecular properties from structural information, and it is widely applied in the search for new solvents, pharmaceuticals, and materials with desired properties. In QSPR modeling, the “simplified molecular input line-entry system” (SMILES) is a popular molecular representation with its own vocabulary and syntax. Herein, SMILES is treated as a chemical language, and each SMILES string is treated as a sentence. A deep pyramid convolutional neural network architecture is constructed to extract information from SMILES “sentences”, and a feed-forward neural network is used to correlate the extracted information with the target property. A case study predicting the logarithm of the octanol–water partition coefficient is conducted to demonstrate the effectiveness of the proposed approach. The developed QSPR models outperform a previously reported reference model, which offers insight into applying natural language processing techniques to molecular information mining and the exploration of chemical property space.
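
To make the "SMILES as a sentence" idea concrete, the sketch below splits a SMILES string into tokens and maps them to integer indices, the kind of input a convolutional text model typically consumes. The abstract does not specify the tokenization scheme, so the regular expression and the helper names `tokenize`, `build_vocab`, and `encode` are assumptions made for illustration only, not the authors' method.

```python
import re

# Assumed token pattern (not from the paper): bracket atoms such as [C@H],
# the two-letter halogens Cl and Br, two-digit ring closures such as %12,
# and otherwise any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize(smiles):
    """Split a SMILES string into a list of tokens ("words")."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(corpus):
    """Assign an integer index to every token seen in the corpus; 0 is padding."""
    vocab = {"<pad>": 0}
    for smiles in corpus:
        for token in tokenize(smiles):
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(smiles, vocab, max_len):
    """Convert a SMILES string to a fixed-length list of token indices."""
    ids = [vocab[token] for token in tokenize(smiles)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


corpus = ["CCO", "ClCCBr"]  # ethanol, 1-bromo-2-chloroethane
vocab = build_vocab(corpus)
encoded = encode("CCO", vocab, max_len=5)
```

Each encoded sequence could then be passed through an embedding layer before the convolutional blocks. A single-character fallback keeps the pattern short, at the cost of not validating that the input is well-formed SMILES.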
