Accurate forecasting of international air cargo volumes is crucial for air cargo carriers: it improves operational efficiency and profitability and strengthens market competitiveness and risk response capabilities. This study focuses on the open-source cargo data of over 100 international airlines in India from January 2015 to March 2017. First, missing data are handled. Then, histograms and probability density plots are used to depict the distribution of cargo volume and to analyze the cargo volume dynamics of each company at the annual and quarterly levels. Subsequently, key factors affecting cargo volumes are analyzed, with a special focus on the cargo performance of the Top 5 airlines, which is visualized in box plots. Given the time-series nature of cargo data, the dataset is partitioned sequentially, preserving its temporal order. The study then applies a Stacking ensemble learning framework that integrates regression models such as Random Forest, Decision Tree, Extreme Gradient Boosting, and K-Nearest Neighbor as the base learners. A meta-learner is constructed to combine the outputs of the base models, and the hyperparameters of the model are optimized by grid search with cross-validation. Comparison results show that the Stacking model achieves a higher R² value and a lower mean squared error (MSE) than any single base learner, indicating that it is more effective for international air cargo volume prediction. The prediction model based on stacking ensemble learning therefore provides an effective strategy for improving air cargo prediction accuracy.
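As an illustration of the sequential partitioning and the two evaluation metrics mentioned above, the following minimal Python sketch splits a time-ordered series without shuffling and computes MSE and R². It is not the study's code: the function names, the 80/20 split fraction, and the synthetic 27-month series (one value per month from January 2015 to March 2017) are all assumptions made for the example.

```python
def sequential_split(rows, train_fraction=0.8):
    """Split time-ordered rows into (train, test) without shuffling.

    The earliest observations form the training set and the most recent
    ones form the test set, so no future data leaks into training.
    The default fraction of 0.8 is an assumption, not the study's value.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def mse(y_true, y_pred):
    """Mean squared error: average of squared prediction errors."""
    errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Synthetic monthly volumes (hypothetical values, not the study's data).
volumes = [100.0 + 5.0 * i for i in range(27)]
train, test = sequential_split(volumes)

# Naive baseline: repeat the last training value for every test month.
baseline = [train[-1]] * len(test)
print("train months:", len(train), "test months:", len(test))
print("baseline MSE:", mse(test, baseline))
print("baseline R^2:", r2_score(test, baseline))
```

A model comparison such as the one reported here would evaluate each base learner and the stacked model on the same held-out tail of the series, preferring the model with the lower MSE and the higher R².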