Bias-Free Chemically Diverse Test Sets from Machine Learning.

Ellen T Swann, Michelle L Coote, Michael Fernandez, Amanda S Barnard

doi:10.1021/acscombsci.7b00087

Abstract

Current benchmarking methods in quantum chemistry rely on databases that are built using a chemist's intuition. It is not fully understood how diverse or representative these databases truly are. Multivariate statistical techniques like archetypal analysis and K-means clustering have previously been used to summarize large sets of nanoparticles; however, molecules are more diverse and not as easily characterized by descriptors. In this work, we compare three sets of descriptors based on the one-, two-, and three-dimensional structure of a molecule. Using data from the NIST Computational Chemistry Comparison and Benchmark Database and machine learning techniques, we demonstrate the functional relationship between these structural descriptors and the electronic energy of molecules. Archetypes and prototypes found with topological or Coulomb matrix descriptors can be used to identify smaller, statistically significant test sets that better capture the diversity of chemical space. We apply this same method to find a diverse subset of organic molecules to demonstrate how the methods can easily be reapplied to individual research projects. Finally, we use our bias-free test sets to assess the performance of density functional theory and quantum Monte Carlo methods.
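
The idea of picking a small, diverse test set from descriptors can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify the implementation, so everything below is an assumption made for illustration. The sketch builds a Coulomb matrix in its commonly used form (diagonal 0.5 Z^2.4, off-diagonal Z_i Z_j / |R_i - R_j|), reduces it to a zero-padded eigenvalue vector, runs a plain K-means in NumPy, and takes the molecule nearest each cluster centre as a prototype. The function names, the toy geometries (rough values, units not treated carefully), and the choice of k are all hypothetical.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5*Z^2.4 on the diagonal, Z_i*Z_j/|R_i-R_j| elsewhere."""
    z = np.asarray(charges, dtype=float)
    r = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)           # placeholder to avoid division by zero
    m = np.outer(z, z) / dist
    np.fill_diagonal(m, 0.5 * z ** 2.4)   # overwrite the placeholder diagonal
    return m

def descriptor(charges, coords, size):
    """Eigenvalues sorted by magnitude and zero-padded to a fixed length (assumed choice)."""
    eig = np.linalg.eigvalsh(coulomb_matrix(charges, coords))
    eig = eig[np.argsort(-np.abs(eig))]
    out = np.zeros(size)
    out[:len(eig)] = eig
    return out

def kmeans_prototypes(x, k, n_iter=100, seed=0):
    """Plain K-means; returns indices of the data points nearest each centre."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        new = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new, centers):
            break
        centers = new
    d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)
    return sorted(set(d.argmin(axis=0).tolist()))

# Toy data set with rough, illustrative geometries (hypothetical, not from the paper).
molecules = {
    "H2":  ([1, 1],    [[0, 0, 0], [0, 0, 0.74]]),
    "HF":  ([9, 1],    [[0, 0, 0], [0, 0, 0.92]]),
    "H2O": ([8, 1, 1], [[0, 0, 0], [0.76, 0.59, 0], [-0.76, 0.59, 0]]),
    "CO":  ([6, 8],    [[0, 0, 0], [0, 0, 1.13]]),
    "N2":  ([7, 7],    [[0, 0, 0], [0, 0, 1.10]]),
}
names = list(molecules)
max_atoms = max(len(z) for z, _ in molecules.values())
X = np.array([descriptor(z, r, max_atoms) for z, r in molecules.values()])

protos = kmeans_prototypes(X, k=2)
print("prototype molecules:", [names[i] for i in protos])
```

In practice the descriptors would usually be standardized before clustering, and a library implementation of K-means or archetypal analysis would replace the hand-written loop; the sketch only shows how a descriptor matrix leads to a reduced set of representative molecules.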

Journal: ACS Combinatorial Science
Publication Date: Jul 27, 2017
Citations: 10

Similar Papers

On the performance of density functional theory methods in the prediction of the electric polarizability and hyperpolarizability of ozone
George Maroulis
06 Mar 2005
Computing Letters | VOL. 1

Performance of density functional theory and orbital-optimised second-order perturbation theory methods for geometries and singlet–triplet state splittings of aryl-carbenes
Reza Ghafarian Shirazi ... Frank Neese
18 May 2020
Molecular Physics | VOL. 118

On the accuracy of density functional theory and wave function methods for calculating vertical ionization energies.
Scott Mckechnie ... Aron J Cohen
21 May 2015
The Journal of Chemical Physics | VOL. 142

Ab initio quantum chemistry with neural-network wavefunctions.
Jan Hermann ... W M C Foulkes
09 Aug 2023
Nature Reviews Chemistry | VOL. 7
