LSTM networks are widely used to predict data with nonlinear and temporal characteristics. However, selecting optimal hyperparameters empirically is difficult, and the choice can significantly affect both prediction performance and modeling time. To address this, we propose a novel hybrid model, CMAL-WOA-LSTM (CWLM), which uses a multi-strategy improved whale optimization algorithm (WOA) to optimize three key hyperparameters of the LSTM. Four modifications improve the performance of the WOA: a Circle chaotic map initializes the population; a modified dynamic backward learning strategy increases population diversity; a nonlinear function adjusts the control parameter over the iterations to balance global exploration with faster convergence; and Lévy-flight random walks update feasible solutions near the current optimum at each iteration. Through benchmark tests and comparative analysis, we illustrate the effectiveness and rationale of the four improvements. We then explain the optimization ideas behind constructing the hybrid model, highlighting its distinctions from traditional deep learning approaches, provide detailed modeling steps for CWLM, and elaborate on how its components interact. CWLM is compared with five other models on milling force data and wear data from high-speed machining tests, using root mean square error, mean absolute error, mean absolute percentage error, R-squared, and computational time as evaluation metrics. The results show that CWLM outperforms the other models in prediction performance and robustness, making it applicable to a wide range of tasks that use LSTM for data prediction.
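To make the WOA modifications concrete, the sketch below illustrates three of them in Python: Circle chaotic-map initialization, a nonlinear decay of the WOA control parameter, and a Lévy-flight step drawn with the Mantegna algorithm. This is a minimal illustrative sketch, not the authors' implementation; the function names, the map constants, and the cosine form of the decay schedule are assumptions.

```python
import numpy as np
from math import gamma, sin, pi

def circle_map_init(n_whales, dim, lb, ub, a=0.5, b=0.2, seed=0):
    """Initialize a population with the Circle chaotic map.

    Map: x_{k+1} = mod(x_k + b - (a / 2*pi) * sin(2*pi * x_k), 1).
    Constants a, b are commonly used values, assumed here.
    """
    rng = np.random.default_rng(seed)
    x = rng.random((n_whales, dim))  # chaotic seeds in (0, 1)
    for _ in range(20):              # iterate the map to spread points chaotically
        x = np.mod(x + b - (a / (2 * pi)) * np.sin(2 * pi * x), 1.0)
    return lb + x * (ub - lb)        # scale into the search bounds [lb, ub]

def nonlinear_a(t, t_max, a_init=2.0):
    """Nonlinearly decay WOA's control parameter a from a_init to 0.

    A cosine schedule is one plausible choice: slow decay early
    (exploration), fast decay late (convergence).
    """
    return a_init * np.cos(pi / 2 * t / t_max)

def levy_step(dim, beta=1.5, rng=None):
    """Draw one Lévy-flight step via the Mantegna algorithm."""
    rng = rng or np.random.default_rng()
    sigma = (gamma(1 + beta) * sin(pi * beta / 2) /
             (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)  # heavy-tailed steps around the optimum
```

In a full optimizer these pieces would plug into the standard WOA loop: the chaotic map replaces uniform random initialization, `nonlinear_a` replaces the usual linear decay of the control parameter, and the Lévy step perturbs candidate solutions around the best whale after each position update.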