Abstract

Molecule generation is crucial for developing materials and drugs, as it enables exploration of the huge, discrete chemical space. Targeting electret material design, 53,000 amines are sampled from PubChem to train deep generative models. The sequence-based model is found to struggle with the syntax of SMILES strings, and its sensitivity to slight changes in the strings limits zoom-in searches for electret molecules with similar substructures. Moreover, even with early stopping, GAN-based likelihood-free molecule generation suffers from mode collapse and a less stable training process. In comparison, the variational auto-encoder (VAE) framework, in which the discrete representation of a molecule is encoded into a real-valued, multidimensional continuous vector, yields better and smoother generative results.
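
To illustrate why slight changes to a SMILES string matter, the following minimal Python sketch checks two syntax rules: balanced branch parentheses and paired ring-closure digits. It is an illustrative assumption, not the method used in this work. The function name `smiles_syntax_ok` is hypothetical, and the check is far weaker than a real SMILES parser (it ignores valence, aromaticity, and treats `%nn` ring labels only digit by digit).

```python
def smiles_syntax_ok(smiles: str) -> bool:
    """Rough SMILES sanity check (hypothetical helper, not a full parser).

    Verifies only that branch parentheses are balanced, bracket atoms are
    closed, and every ring-closure digit outside brackets appears in pairs.
    """
    depth = 0            # current nesting depth of branch parentheses
    open_rings = set()   # ring-closure digits opened but not yet closed
    in_bracket = False   # True while inside a bracket atom such as [NH4+]

    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif in_bracket:
            # Digits inside brackets are isotopes, charges, or H counts.
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False  # a branch was closed before it was opened
        elif ch.isdigit():
            # The first occurrence opens a ring bond, the second closes it.
            if ch in open_rings:
                open_rings.remove(ch)
            else:
                open_rings.add(ch)

    return depth == 0 and not open_rings and not in_bracket


# Triethylamine and cyclohexylamine, each followed by a one-character deletion.
for s in ["CCN(CC)CC", "CCN(CCCC", "NC1CCCCC1", "NC1CCCCC"]:
    print(s, smiles_syntax_ok(s))
```

In this sketch, deleting a single character (a closing parenthesis or a ring-closure digit) turns a well-formed amine string into one that fails the check, which is the kind of fragility a sequence-based generator has to learn to avoid.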
