With the rapid development of e-commerce, precise demand forecasting and efficient inventory management have become essential for the success and profitability of retail businesses. This study focuses on demand forecasting for e-commerce retailers using the Seasonal Autoregressive Integrated Moving Average (SARIMA) model and the K-means clustering algorithm. The research uses a dataset of 1,996 sales time series covering various products, merchants, and warehouses, and aims to predict demand over the next 15 days. The study first fits three models to historical sales data to forecast future demand: Linear Regression (LR), Autoregressive Integrated Moving Average (ARIMA), and SARIMA. SARIMA is identified as the most effective of the three when evaluated with 1-mWAPE (where mWAPE is the mean weighted absolute percentage error) and RMSE (root mean square error). To make demand categories more homogeneous, K-means clustering is applied to divide products into four distinct groups, further refining the forecasting process. The paper also addresses the challenge of integrating new sequences into the dataset: clustering results are used to classify each new sequence, and cosine similarity is used to identify analogous historical time series. These matched sequences serve as the basis for demand prediction with the established SARIMA model. The findings highlight the robustness of the SARIMA model in capturing trends and seasonality, providing a reliable framework for e-commerce demand forecasting that can inform inventory strategies and improve operational efficiency.
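The evaluation metrics and the cosine-similarity matching step described above can be illustrated with a minimal NumPy sketch. The abstract does not give formulas, so the definitions here are assumptions: WAPE is taken as the sum of absolute errors divided by the sum of absolute actuals, and a new sequence is compared against the most recent days of each historical series. The function names `wape`, `rmse`, and `most_similar_series` are hypothetical and are not taken from the paper.

```python
import numpy as np


def wape(actual, forecast):
    """Weighted absolute percentage error, assumed here as sum|a - f| / sum|a|."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.abs(actual - forecast).sum() / np.abs(actual).sum())


def rmse(actual, forecast):
    """Root mean square error between actual and forecast values."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))


def most_similar_series(new_series, history):
    """Return (row index, cosine similarity) of the closest historical series.

    Assumption: the new sequence is shorter than the historical ones, so it is
    compared against the last len(new_series) observations of each row.
    """
    new_series = np.asarray(new_series, dtype=float)
    history = np.asarray(history, dtype=float)
    window = history[:, -len(new_series):]
    norms = np.linalg.norm(window, axis=1) * np.linalg.norm(new_series)
    # Rows with zero norm (no sales) get similarity 0 instead of a division error.
    sims = np.divide(window @ new_series, norms,
                     out=np.zeros(len(window)), where=norms > 0)
    best = int(np.argmax(sims))
    return best, float(sims[best])
```

Under these assumptions, an accuracy score in the style of 1-mWAPE would be computed as `1 - wape(actual, forecast)`, and the series returned by `most_similar_series` (searched within the new sequence's cluster) would supply the history on which a SARIMA model is fitted.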