Equifinality and premature convergence can cause considerable errors when groundwater contamination sources are characterized and contaminant transport parameters are estimated simultaneously. To address this problem, we design a sensitivity-dependent progressive optimization system that embeds an ensemble-learning technique. To avoid repeated, CPU-demanding model evaluations in Sobol’ global sensitivity analysis and swarm intelligence optimization-based inverse modeling, Kriging, support vector regression (SVR), kernel extreme learning machine (KELM), and deep convolutional neural network (DCNN) models are compared and ensembled to build an accurate surrogate of the numerical model. In addition, the sensitivities of the source characteristics and contaminant transport parameters serve as indicators for adjusting the displacement vectors of the swarm in each iteration of the optimization, so that elements with different sensitivities are identified in a balanced manner. Moreover, a homotopy-based progressive search mechanism is developed to approach the global optimum over a large search space, with the aim of preventing premature convergence in multimodal search problems. The results indicate that the ensemble learning model efficiently captures the complex input-output relationship of the numerical model, with an increased coefficient of determination (R² = 0.9988) and a mean relative error limited to 0.9314%. Although the contributions of the source characteristics and contaminant transport parameters to the spatiotemporal distribution of contaminants vary dramatically, the combined application of sensitivity analysis, homotopy theory, and swarm intelligence optimization provides a more stable and accurate estimation of all the elements. The mean relative error of the identification results is significantly reduced from 7.2184% to 3.2718%, and the maximum relative error is limited to 9.9501%.
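As a minimal illustration of the two accuracy measures quoted above, the following Python sketch averages the predictions of two surrogates and scores the result with the coefficient of determination and the mean relative error. It rests on several assumptions that are not taken from the study: the ensemble is formed by equal-weight averaging, both measures follow their standard textbook definitions, and all names (`ensemble_mean`, `r_squared`, `mean_relative_error`) and numerical values are hypothetical.

```python
from statistics import fmean


def ensemble_mean(predictions):
    """Combine surrogates by equal-weight averaging (an assumed scheme)."""
    # zip(*predictions) groups the predictions of all surrogates per sample.
    return [fmean(values) for values in zip(*predictions)]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_relative_error(observed, predicted):
    """Mean of |observed - predicted| / |observed|, expressed in percent."""
    ratios = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * fmean(ratios)


# Hypothetical concentrations: numerical-model outputs and two surrogate outputs.
observed = [10.0, 20.0, 40.0]
surrogate_a = [9.0, 21.0, 42.0]
surrogate_b = [11.0, 21.0, 38.0]

ensemble = ensemble_mean([surrogate_a, surrogate_b])
r2 = r_squared(observed, ensemble)
mre = mean_relative_error(observed, ensemble)
```

In this toy case the errors of the two surrogates partly cancel in the average, which is the general motivation for ensembling; a weighted combination could be substituted in `ensemble_mean` without changing the two scoring functions.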