Abstract

Machine learning is a branch of artificial intelligence in which a system learns automatically from experience without being explicitly programmed. The learning process starts by observing data and then identifying patterns in that data. In this study, we use machine learning to predict molecular atomization energy. Of the many machine learning methods available, we use two: Neural Network and Extreme Gradient Boosting. Both methods have several parameters that must be tuned so that the predicted atomization energy has the lowest possible error, and we search for suitable parameter values for each. For the neural network, this search is difficult because training takes a long time, so it is slow to find out whether a given configuration is good or bad. For Extreme Gradient Boosting, training is faster, so suitable parameter values are easier to find. This study also examines the effect of modifying the dataset: transforming the output by normalization or standardization, removing molecules that contain Br atoms, and setting an entry in the Coulomb matrix to 0 if the distance between the corresponding atoms exceeds 2 angstroms.
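
The dataset modifications described in the abstract can be sketched as follows. This is only an illustration, not the authors' implementation: the function names are hypothetical, "normalization" is assumed to mean min-max scaling to [0, 1], "standardization" is assumed to mean zero mean and unit variance, and the atomic coordinates are assumed to be given in angstroms alongside an unpadded Coulomb matrix.

```python
import math
from statistics import fmean, pstdev


def apply_distance_cutoff(coulomb, coords, cutoff=2.0):
    """Return a copy of the Coulomb matrix with entry (i, j) set to 0
    when atoms i and j are farther apart than `cutoff` (assumed angstroms).
    Diagonal entries have distance 0 and are therefore always kept."""
    n = len(coords)
    return [
        [
            coulomb[i][j] if math.dist(coords[i], coords[j]) <= cutoff else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def normalize(values):
    """Min-max scaling of the target energies to [0, 1] (assumed definition)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Zero-mean, unit-variance scaling of the target energies (assumed definition)."""
    mean, std = fmean(values), pstdev(values)
    return [(v - mean) / std for v in values]


# Toy example: three atoms on a line at x = 0, 1 and 3 angstroms.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
coulomb = [[1, 2, 3], [2, 4, 5], [3, 5, 6]]
trimmed = apply_distance_cutoff(coulomb, coords)  # only the (0, 2) pair exceeds 2.0
```

Predictions made on transformed targets would need to be mapped back to the original energy scale before errors are compared across datasets.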

Highlights

  • The atomization energy of molecules plays an important role today, especially in compound design in the chemical and pharmaceutical industries

  • Results are reported for the neural network model on three datasets: dataset A contains the Coulomb matrix data without modification; dataset B contains the Coulomb matrix data without molecules containing Br atoms; dataset C contains the Coulomb matrix data without molecules containing Br atoms, with an entry in the Coulomb matrix set to 0 if the distance between the corresponding atoms is more than 2.0 angstroms (Table 1: The approximation results using machine learning without transformation)

  • With respect to prediction accuracy, the root mean square error (RMSE) of our Neural Network results is more than 600 lower than that of our Extreme Gradient Boosting (XGB) results
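
For reference, the RMSE used to compare the two models in the highlights above can be computed as in the minimal sketch below. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted energies."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values: every prediction is off by exactly 2 units.
example_error = rmse([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 32.0, 38.0])
```

Because the errors are squared before averaging, a few badly predicted molecules can dominate the RMSE.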


Summary

Introduction

The atomization energy of molecules plays an important role today, especially in compound design in the chemical and pharmaceutical industries. At present, calculating the atomization energy of a molecule requires a long computation time: possibly a few days, several weeks, or even several months. Our idea is to use a machine learning model to predict the atomization energy of a molecule, which reduces computation time and the cost of the computational process. The model is trained with the aim of minimizing the prediction error of the molecular atomization energy.

