The rapid development of e-commerce platforms has led many retailers to store their goods in logistics warehouses that the platform provides and manages centrally. At the same time, advances in artificial intelligence and machine learning have had a major impact on the e-commerce industry. In this setting, accurate sales forecasting is a core task. This article combines machine learning models with time series analysis to improve the accuracy of sales prediction. We first apply the K-Means algorithm for unsupervised multi-level clustering, determining the optimal value of K with the elbow method and the silhouette coefficient. After assigning class labels to the data, we evaluate the clustering objectively using the Calinski-Harabasz index together with a visualization of the SKUs after each round of clustering. Finally, using 1-WMAPE as the evaluation metric, we compare the performance of ARIMA-ANN, BiLSTM, GRU, CNN-LSTM, stacked LSTM, and three tree models in predicting SKU sales. We compare the accuracy of 15-day recursive prediction with that of 15-day one-time output, using sliding window sizes of 3, 5, 7, and 14 days. The comparison shows that the BiLSTM model performs best at predicting 15-day sales volume with a 3-day sliding window in categories with many samples, while ARIMA-ANN performs best at recursive prediction of 15-day sales volume in categories with few samples. The ARIMA parameters are tuned using the ACF and PACF to maximize predictive performance.
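To make the evaluation setup concrete, the following is a minimal sketch of the 1-WMAPE metric and of the sliding-window construction for a 15-day one-time output. It assumes the common definition WMAPE = Σ|actual − forecast| / Σ|actual|, which the abstract does not spell out; the function names `one_minus_wmape` and `make_windows` are hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple


def one_minus_wmape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Return 1 - WMAPE, assuming WMAPE = sum(|a - f|) / sum(|a|)."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("WMAPE is undefined when total actual sales are zero")
    error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 1.0 - error / total


def make_windows(
    series: Sequence[float], window: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Split a sales series into (input window, target horizon) pairs.

    Each input holds `window` consecutive days; each target holds the
    `horizon` days that immediately follow (one-time multi-step output).
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        x = list(series[start:start + window])
        y = list(series[start + window:start + window + horizon])
        pairs.append((x, y))
    return pairs


# Illustrative data only: 20 days of made-up daily sales for one SKU.
daily_sales = [float(d) for d in range(20)]
samples = make_windows(daily_sales, window=3, horizon=15)
score = one_minus_wmape([10, 20, 30], [12, 18, 33])
```

Under this construction, a recursive forecast would instead use `horizon=1` and feed each prediction back into the next input window, which is the alternative the comparison above refers to.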