Precise long-term streamflow prediction has long been important in hydrology because it provides essential information for efficient water-resource management and disaster prevention. Attention to this field has increased recently owing to water and climate crises. Despite remarkable improvements, existing data-driven models still have weaknesses, particularly for multistep predictions in poorly gauged basins. The purpose of this study is to improve multistep-ahead prediction by considering mesoscale hydroclimate data as booster predictors and employing attention-based deep learning. To this end, a meta-algorithm is proposed for analyzing time-series data in complex geo-spatiotemporal environments and for assessing the superiority of prediction with booster predictors through direct and direct-recursive hybrid strategies. In the subsequent stage, a novel integrated neural network architecture is presented that couples a deep convolutional neural network (CNN) with a deep attention network. The former performs automatic feature engineering, and the latter handles the complexities of lengthy sequences. Four state-of-the-art combinations for 12-month-ahead prediction are introduced, pairing either a TimeDistributed-CNN (TD-CNN) or a 3D-CNN with either a Long- and Short-term Time-series network (LSTNet) or a Transformer network. In addition, a base architecture containing the ConvLSTM2D layer, which is compatible with multidimensional time-series data, was employed for model comparison. All models were applied to the Karkheh River basin in the northeast of the Persian Gulf, where monthly historical streamflow records are available from 1955 to 2021. The results revealed that using hydroclimate sea surface temperature and mean surface level pressure data along with local data increased the prediction accuracy.
Moreover, the proposed integrated networks delivered more accurate long-term streamflow predictions than the base models according to various evaluation criteria, including r, R2, mean absolute error, root-mean-square error, Kling–Gupta efficiency, Willmott's index, the Legates–McCabe index, and the Akaike information criterion. In summary, the 3D-CNN–Transformer achieved the best performance, followed by the TD-CNN–Transformer, TD-CNN–LSTNet, and 3D-CNN–LSTNet, with R2 values of 0.952, 0.930, 0.900, and 0.837, respectively. This study demonstrates that the application of hydroclimate data with the proposed integrated networks is particularly useful for poorly gauged basins. Thus, the proposed models can potentially improve multistep-ahead streamflow prediction compared to univariate and equation-based models.
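
For readers unfamiliar with the less common evaluation criteria named above, the following is a minimal sketch of how several of them could be computed for paired observed and simulated streamflow series. It is an illustration only: the function names are hypothetical, and the formulas assume commonly used definitions (the 2009 form of the Kling–Gupta efficiency, Willmott's original index of agreement, and the absolute-value Legates–McCabe index). The abstract does not state which variants the study used.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def pearson_r(obs, sim):
    """Pearson correlation coefficient between observed and simulated values."""
    mo, ms = _mean(obs), _mean(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_s = sum((s - ms) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def mae(obs, sim):
    """Mean absolute error."""
    return sum(abs(o - s) for o, s in zip(obs, sim)) / len(obs)


def rmse(obs, sim):
    """Root-mean-square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed 2009 form): combines correlation,
    variability ratio (alpha), and bias ratio (beta)."""
    mo, ms = _mean(obs), _mean(sim)
    sd_o = math.sqrt(sum((o - mo) ** 2 for o in obs) / len(obs))
    sd_s = math.sqrt(sum((s - ms) ** 2 for s in sim) / len(sim))
    r = pearson_r(obs, sim)
    alpha = sd_s / sd_o
    beta = ms / mo
    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def willmott_d(obs, sim):
    """Willmott's index of agreement (assumed original squared form)."""
    mo = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((abs(s - mo) + abs(o - mo)) ** 2 for o, s in zip(obs, sim))
    return 1 - num / den


def legates_mccabe(obs, sim):
    """Legates-McCabe index (assumed absolute-value form, often written E1)."""
    mo = _mean(obs)
    num = sum(abs(o - s) for o, s in zip(obs, sim))
    den = sum(abs(o - mo) for o in obs)
    return 1 - num / den


if __name__ == "__main__":
    # Toy values only, not data from the study.
    observed = [1.0, 2.0, 3.0, 4.0]
    simulated = [2.0, 3.0, 4.0, 5.0]
    for fn in (pearson_r, mae, rmse, kge, willmott_d, legates_mccabe):
        print(fn.__name__, round(fn(observed, simulated), 3))
```

In this toy case the simulated series is the observed series shifted by a constant, so the correlation stays perfect while the bias-sensitive criteria (Kling–Gupta efficiency, Willmott's index, and the Legates–McCabe index) are penalized. This shows why reporting several criteria together is more informative than r or R2 alone.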