Most machine learning algorithms for jacking force prediction have been applied to micro-tunneling; in contrast, few studies of long-distance, large-section jacking projects have been reported in the literature. In this study, an intelligent framework combining differential evolution (DE), a bidirectional gated recurrent unit (BiGRU), and attention mechanisms was developed to automatically identify the optimal hyperparameters, assign weights to the information features, and capture the bidirectional temporal features of sequential data. Based on field data from a pipe jacking project crossing underneath a canal, the model’s performance was compared with that of four conventional models (RNN, GRU, BiGRU, and DE–BiGRU). The results indicated that the DE–BiGRU–attention model performed best of the five. The generalization performance of the proposed model in predicting jacking forces was then evaluated using a similar case at the same site. It was found that fine-tuning parameters for specific projects is essential for improving the model’s generalization performance. More generally, the proposed prediction model was found to be practically useful to professionals and engineers for making real-time adjustments to jacking parameters, predicting jacking force, and carrying out performance evaluations.
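As an illustration of the hyperparameter-search component only, a minimal sketch of differential evolution (the common DE/rand/1/bin variant) is given below. The abstract does not state which DE variant, search space, or control parameters were used, so everything here is an assumption: the function names, the bounds, the values of `F` and `CR`, and in particular `toy_validation_loss`, a hypothetical stand-in for training a BiGRU–attention model and returning its validation error.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, F=0.5, CR=0.9,
                           generations=100, seed=0):
    """Minimize `objective` over box `bounds` with DE/rand/1/bin (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from the target i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for d in range(dim):
                if d == j_rand or rng.random() < CR:
                    # Mutation: base vector plus scaled difference vector.
                    value = pop[a][d] + F * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]  # inherit the gene from the target
                trial.append(value)
            # Greedy selection: keep the trial only if it is no worse.
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


# Hypothetical search space: log10(learning rate) and number of hidden units.
BOUNDS = [(-5.0, -1.0), (8.0, 256.0)]


def toy_validation_loss(params):
    """Stand-in objective; a real run would train the network and return its error."""
    log_lr, hidden_units = params
    return (log_lr + 3.0) ** 2 + ((hidden_units - 64.0) / 64.0) ** 2


if __name__ == "__main__":
    best_params, best_loss = differential_evolution(toy_validation_loss, BOUNDS)
    print("best hyperparameters:", best_params, "loss:", best_loss)
```

In practice each objective evaluation would be a full training run, so the population size and number of generations would have to be chosen with that cost in mind.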