Accurate estimation of crop evapotranspiration (ET) is essential for efficient use of agricultural water resources, higher crop production, and sustainable agricultural development. However, direct measurement of ET is expensive, complex, and time-consuming, which motivates the development of models that can estimate ET accurately in agricultural ecosystems. To address this problem, this study proposed a novel model (GWA-CNN-BiLSTM) that combines the Grey Wolf Algorithm (GWA), a Convolutional Neural Network (CNN), and a Bidirectional Long Short-Term Memory network (BiLSTM) as the hyperparameter tuner, feature extractor, and regression component, respectively. The model estimates ET from various input combinations of net solar radiation (Rn), vapor pressure deficit (VPD), average air temperature (Ta), soil water content (SWC), and leaf area index (LAI) for a winter wheat-spring maize rotation system in the Loess Plateau during 2012–2020. In a comparative assessment of the GWA-CNN-BiLSTM, Convolutional Bidirectional Long Short-Term Memory network (CNN-BiLSTM), BiLSTM, Long Short-Term Memory network (LSTM), and Shuttleworth-Wallace (SW) models, GWA-CNN-BiLSTM performed best across the input combinations, with a determination coefficient (R2) of 0.562 to 0.957, a relative root mean square error (RRMSE) of 8.4–41.5 %, a mean absolute error (MAE) of 0.349 mm d−1 to 1.521 mm d−1, a percent bias (PBIAS) of −3.26 % to 14.11 %, and a regression coefficient (b0) of 0.820–1.091. Moreover, BiLSTM was more accurate than LSTM, and adding the CNN module improved its performance notably further. Additionally, with the complete input combination, the LSTM-type models were more precise than SW, with improvements of 29.7−51.4 % in R2, 44.2−76.1 % in RRMSE, and 33.6−63.4 % in MAE.
Furthermore, under all input combinations every model was more accurate for winter wheat than for spring maize, with improvements of 1.4−4.3 % in R2, 5.1−20.1 % in RRMSE, and 3.1−17.9 % in MAE. The meteorological factors (Rn, VPD, Ta) proved to be the most important inputs for ET estimation in both winter wheat and spring maize. SWC was more important than LAI in winter wheat, whereas the opposite was observed in spring maize. In brief, GWA-CNN-BiLSTM is the recommended model for estimating ET of the winter wheat-spring maize rotation system in the Loess Plateau under diverse input data scenarios, and it can support regional agricultural water management decisions.
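
The five evaluation statistics named above can be illustrated with a minimal Python sketch. The abstract does not give their formulas, so the definitions here are common conventions and should be read as assumptions: R2 as the squared Pearson correlation, RRMSE as RMSE divided by the observed mean, PBIAS as the summed error (estimated minus observed) relative to the summed observations, and b0 as the slope of a regression forced through the origin. The function name `et_metrics` is hypothetical.

```python
from math import sqrt


def et_metrics(observed, estimated):
    """Return R2, RRMSE (%), MAE, PBIAS (%), and b0 for paired ET series.

    All formulas are assumed conventions, not taken from the study.
    """
    n = len(observed)
    if n == 0 or n != len(estimated):
        raise ValueError("observed and estimated must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    errors = [e - o for o, e in zip(observed, estimated)]

    # Root mean square error and mean absolute error (same units as ET).
    rmse = sqrt(sum(d * d for d in errors) / n)
    mae = sum(abs(d) for d in errors) / n

    # Squared Pearson correlation (assumed definition of R2).
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    r2 = cov * cov / (var_obs * var_est)

    return {
        "R2": r2,
        # RMSE relative to the observed mean, in percent.
        "RRMSE": 100.0 * rmse / mean_obs,
        "MAE": mae,
        # Positive values mean overestimation under this sign convention.
        "PBIAS": 100.0 * sum(errors) / sum(observed),
        # Slope of estimated-vs-observed regression through the origin.
        "b0": sum(o * e for o, e in zip(observed, estimated))
        / sum(o * o for o in observed),
    }
```

Under these assumed definitions, a perfect estimate gives R2 = 1, RRMSE = 0 %, MAE = 0, PBIAS = 0 %, and b0 = 1, so values of b0 close to 1 and PBIAS close to 0 indicate little systematic over- or underestimation.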