Abstract

South Korea relies heavily on road transportation and consumes a large amount of gasoline. Because gasoline is not produced domestically, it must be imported, so fluctuations in gasoline prices have a significant impact on the national economy. Currently, gasoline orders, which are based on gasoline consumption, are analyzed in relation to fluctuations in gasoline prices. However, gasoline orders can also change because of various non-price factors. To understand the trend of gasoline orders, it is therefore important to identify the additional factors that gas stations consider when determining orders. We collected 180 monthly samples of data on 167 variables. Sudden international issues cause rapid fluctuations in gasoline orders, which can produce outliers. Because outliers are generally fewer in number than normal data points, a class imbalance occurs. To address this imbalance, we propose a method that groups the data samples into 11 clusters using the K-means clustering algorithm and then augments the data into 85 datasets in each cluster using a Variational Auto-Encoder (VAE). We evaluated the augmented datasets using the R-squared, root mean squared error (RMSE), and accuracy of various regression models. In our experiments, linear regression showed the best performance when predicting gasoline orders at gas stations in South Korea from the augmented datasets.
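The balance-then-evaluate idea can be illustrated with a minimal sketch. It is not the authors' implementation: cluster labels are assumed to be given already (for example from K-means), Gaussian jitter is used as a simple stand-in for the Variational Auto-Encoder, and the names `balance_clusters`, `r_squared`, and `rmse` are hypothetical helpers introduced only for this example.

```python
import math
import random
from collections import defaultdict


def balance_clusters(samples, labels, target_size, seed=0):
    """Grow every cluster to at least target_size rows.

    Assumption: synthetic rows are jittered copies of real rows. This is a
    stand-in for the VAE step, not the method described in the abstract.
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for row, label in zip(samples, labels):
        clusters[label].append(list(row))

    balanced = {}
    for label, rows in clusters.items():
        n_features = len(rows[0])
        # Per-feature population standard deviation within this cluster.
        stds = []
        for j in range(n_features):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            stds.append(math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)))
        out = list(rows)
        # Small clusters (e.g. outlier months) receive the most synthetic rows.
        while len(out) < target_size:
            base = rng.choice(rows)
            out.append([v + rng.gauss(0.0, s) for v, s in zip(base, stds)])
        balanced[label] = out
    return balanced


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
```

Calling `balance_clusters(samples, labels, 85)` would return a dictionary that maps each cluster label to a list of at least 85 rows. A regression model fitted on the balanced rows could then be scored with `r_squared` and `rmse` on held-out observations.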

