Abstract

We compressed several binary molecular fingerprints with an autoencoder neural network and analyzed how compression affects their performance in downstream classification and regression tasks. Classifiers trained on compressed fingerprints were negligibly affected. Regression models benefited from compression, especially of long fingerprints (Morgan, RDK), but their performance dropped rapidly at compression levels above 90%. Property co-learning improved the predictive power of the compressed fingerprints, with a mean score improvement of up to 20%. This suggests that autoencoder compression with property co-learning biases the molecular representation toward the predicted target, which facilitates downstream training.
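
As a rough illustration of the two quantities the abstract relies on, the sketch below computes a latent width for a given compression level and a joint objective that combines bit reconstruction with property prediction. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define "compression level" or the loss, so the fractional-length definition, the binary cross-entropy plus weighted squared-error form, the 2048-bit example length, and the names `latent_size`, `joint_loss`, and `weight` are all hypothetical.

```python
import math


def latent_size(n_bits: int, compression: float) -> int:
    """Latent width that keeps (1 - compression) of the input length.

    Assumption: compression level = fractional reduction in vector length.
    """
    if not 0.0 <= compression < 1.0:
        raise ValueError("compression must be in [0, 1)")
    return max(1, round(n_bits * (1.0 - compression)))


def joint_loss(bits, recon, y_true, y_pred, weight=1.0, eps=1e-7):
    """Reconstruction loss plus a weighted property term (assumed form).

    bits   : original fingerprint bits (0 or 1)
    recon  : decoder outputs, probabilities in [0, 1]
    y_true : measured property value
    y_pred : property predicted from the latent code
    weight : hypothetical trade-off between the two terms
    """
    bce = 0.0
    for b, p in zip(bits, recon):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        bce -= b * math.log(p) + (1 - b) * math.log(1.0 - p)
    bce /= len(bits)  # mean binary cross-entropy over the bits
    return bce + weight * (y_true - y_pred) ** 2


# A hypothetical 2048-bit fingerprint at the 90% level mentioned above.
print(latent_size(2048, 0.9))
```

With `weight` set to zero the objective reduces to plain autoencoder reconstruction. A positive weight adds the property term, which is one way the latent code could be pushed toward the predicted target.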
