Abstract
Despite the increasing importance of cyclic hydrocarbons in various chemical systems, studies on the fundamental properties of these compounds, such as enthalpy of formation, are still scarce. One reason is that estimating the thermodynamic properties of cyclic hydrocarbon species with cost-effective computational approaches, such as group additivity (GA), has several limitations and challenges. In this study, a machine learning (ML) approach is proposed that uses a support vector regression (SVR) algorithm to predict the standard enthalpy of formation of cyclic hydrocarbon species. The model is developed from a carefully selected dataset of accurate experimental values for 192 species collected from the literature. The molecular descriptors used as input to the SVR are calculated with the alvaDesc software, which computes a total of 5255 features classified into 30 categories. The developed SVR model has an average error of approximately 10 kJ/mol. The SVR model outperforms the GA approach for complex molecules and can therefore be proposed as a novel data-driven approach to estimate enthalpy values for complex cyclic species. A sensitivity analysis is also conducted to identify the features that most affect the standard enthalpy of formation of cyclic species. Our species dataset is expected to be updated and expanded as new data become available, in order to develop a more accurate SVR model with broader applicability.
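The workflow described in the abstract (descriptor matrix in, enthalpy of formation out) can be sketched with scikit-learn. This is a minimal illustration under stated assumptions, not the authors' implementation: the descriptor matrix and target values below are synthetic random numbers rather than alvaDesc output or experimental enthalpies, and the kernel, `C`, `epsilon`, the 20-feature width, and the 80/20 split are placeholders that the source does not specify.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-in for a descriptor matrix.
# 192 rows mirrors the dataset size in the abstract; 20 columns is arbitrary.
n_species, n_descriptors = 192, 20
X = rng.normal(size=(n_species, n_descriptors))

# ASSUMPTION: synthetic target, NOT real enthalpies of formation.
weights = rng.normal(size=n_descriptors)
y = X @ weights + 0.1 * rng.normal(size=n_species)

# Hold out part of the data to estimate the prediction error.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Descriptors span very different scales, so standardize before the SVR.
# Kernel and hyperparameters are illustrative, not taken from the paper.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(f"Mean absolute error on held-out synthetic data: {mae:.3f}")
```

With real data, `X` would hold the computed descriptors (after removing constant or missing columns) and `y` the experimental enthalpies in kJ/mol, so that the held-out error is directly comparable to the roughly 10 kJ/mol figure quoted above.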
Highlights
With the development of alternative fuels coming from different sources, as well as new additives from petroleum, cyclic hydrocarbons have become important components of current and future fuels.[1,2] Cyclic hydrocarbons, such as polycyclic aromatic hydrocarbons (PAH), are common intermediates in flames that lead to soot formation
A data-driven approach based on the support vector regression (SVR) algorithm was developed to predict enthalpy values for cyclic hydrocarbons with a dataset of 192 species collected from Ghahremanpour et al.,[19] CRC,[20] and Minenkov et al.[21]
Molecular descriptors computed with alvaDesc[24] were used as input features; they are generated from simplified molecular input line entry system (SMILES) strings, which encode chemical structures as text
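To make the SMILES encoding concrete, the sketch below counts ring-closure bonds directly from the text string using only the Python standard library. It is a simplified heuristic for illustration, not part of the study's pipeline and not a full SMILES parser: the helper name `count_ring_closures` is hypothetical, and a cheminformatics toolkit should be used for real ring perception.

```python
import re


def count_ring_closures(smiles: str) -> int:
    """Count ring-closure bonds in a SMILES string (heuristic sketch).

    In SMILES, a ring bond is written by giving the same label to the two
    atoms it joins, e.g. the two '1' digits in 'C1CCCCC1' (cyclohexane).
    """
    # Replace bracket atoms such as [13CH4] or [NH4+] with a placeholder so
    # their isotope, charge, and hydrogen-count digits are not miscounted.
    stripped = re.sub(r"\[[^\]]*\]", "*", smiles)
    # Ring labels are single digits, or '%' followed by two digits.
    labels = re.findall(r"%\d{2}|\d", stripped)
    # Each ring-closure bond uses its label twice (open and close).
    return len(labels) // 2


examples = {
    "hexane": "CCCCCC",
    "cyclohexane": "C1CCCCC1",
    "benzene": "c1ccccc1",
    "naphthalene": "c1ccc2ccccc2c1",
}
for name, smiles in examples.items():
    print(f"{name:12s} {smiles:16s} ring closures: {count_ring_closures(smiles)}")
```

A nonzero count marks a species as cyclic, which is the class of molecules the dataset targets.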
Summary
With the development of alternative fuels coming from different sources, as well as new additives from petroleum, cyclic hydrocarbons have become important components of current and future fuels.[1,2] Cyclic hydrocarbons, such as polycyclic aromatic hydrocarbons (PAH), are common intermediates in flames that lead to soot formation. Cyclic hydrocarbons are important in combustion chemistry and in other fields; cyclic unsaturated hydrocarbons can lead to the formation of Criegee intermediates[3] and highly oxidized organic compounds,[4] with implications for pollutant formation and climate. Knowledge of their molecular properties can help build atmospheric and combustion models. Additional data for complex cyclic hydrocarbons, both saturated and unsaturated, have been derived in recent years.[6,7]