Abstract
Machine learning (ML) plays a growing role in the design and discovery of chemicals, with the aim of reducing the need for expensive experiments and simulations. ML for such applications is promising but difficult: models must generalize to vast chemical spaces from small training sets, and they need reliable uncertainty quantification to identify and prioritize unexplored regions. Ab initio computational chemistry and chemical intuition alike often take advantage of differences between chemical conditions, rather than their absolute structure or state, to generate more reliable results. We have developed an analogous comparison-based approach for ML regression, called pairwise difference regression (PADRE), which operates on pairs of input data points and is applicable to arbitrary underlying learning models. During training, the model learns to predict differences between all possible pairs of input points. During prediction, each test point is paired with every training set point, giving rise to a set of predictions that can be treated as a distribution: its mean serves as the final prediction and its dispersion as an uncertainty measure. PADRE reliably improved the performance of the random forest algorithm across five chemical ML tasks. Additionally, the pair-derived dispersion is well correlated with model error and performs well in active learning. We also show that this method is competitive with state-of-the-art neural network techniques. Thus, pairwise difference regression is a promising tool for candidate selection algorithms used in chemical discovery.
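The training and prediction steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the class name `PairwiseDifferenceRegressor`, the helper `make_pairs`, the choice to represent a pair as the concatenation `[x_i, x_j, x_i - x_j]`, and the synthetic data are all assumptions made here. Only the overall scheme (learn differences over all training pairs, pair each test point with every training point, then take the mean and dispersion) comes from the text.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def make_pairs(A, B):
    """Pair every row of A with every row of B.

    Assumed pair representation (not stated in the abstract): [a, b, a - b].
    Row order is (a0,b0), (a0,b1), ..., (a1,b0), ...
    """
    a = np.repeat(A, len(B), axis=0)  # each row of A repeated len(B) times
    b = np.tile(B, (len(A), 1))       # B stacked len(A) times
    return np.hstack([a, b, a - b])


class PairwiseDifferenceRegressor:
    """Hypothetical wrapper around any regressor exposing fit/predict."""

    def __init__(self, base_model):
        self.base_model = base_model

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y, dtype=float)
        n = len(self.y_train)
        pair_X = make_pairs(self.X_train, self.X_train)
        # Target for pair (i, j) is y_i - y_j, in the same order as make_pairs.
        pair_y = np.repeat(self.y_train, n) - np.tile(self.y_train, n)
        self.base_model.fit(pair_X, pair_y)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        n = len(self.y_train)
        # Predicted differences between each test point and each training point.
        diffs = self.base_model.predict(make_pairs(X, self.X_train))
        diffs = diffs.reshape(len(X), n)
        # One estimate per training anchor: y_j + predicted (y_test - y_j).
        estimates = diffs + self.y_train
        # Mean is the prediction; standard deviation is the dispersion.
        return estimates.mean(axis=1), estimates.std(axis=1)


# Synthetic data, purely for illustration.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(40, 3))
y = X[:, 0] ** 2 + 2.0 * X[:, 1] - X[:, 2]

model = PairwiseDifferenceRegressor(
    RandomForestRegressor(n_estimators=50, random_state=0)
).fit(X[:30], y[:30])
mean, spread = model.predict(X[30:])
```

Note that training on all pairs grows the training set quadratically (30 points become 900 pairs here), which suits the small-data regime the abstract targets but would need subsampling for large sets. In an active learning loop, the candidates with the largest `spread` would be natural choices to label next.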