High-speed streams of solar wind can disturb Earth’s magnetic environment. Current state-of-the-art methods for forecasting geomagnetic effects can predict perturbations up to two hours in advance by using solar wind measurements made at the L1 Lagrange point. Forecasting these solar wind parameters should make it possible to extend the time range over which such methods can be used.

We present the Solar Wind Attention Network (SWAN), an attention-based autoregressive model for forecasting daily average bulk speed values. As a proof of concept, we design its architecture automatically with a novel pipeline of hyperparameter optimization techniques. This pipeline yields a robust, reliable architecture whose convergence we verify empirically across a variety of solar wind speed forecasting problem configurations.

In terms of root mean squared error (RMSE), SWAN improves on WindNet and on the convolutional neural networks (CNNs) that extend it, even without incorporating images of the solar disk into its input. At a lead time of three days, SWAN’s RMSE of 68.79 ± 1.05 km s⁻¹ compares with RMSE values of 80.28 ± 3.05, 76.30 ± 1.87 and 71.93 ± 0.40 km s⁻¹ for the compared models. Based on these results, we discuss the possibility of combining autoregressive and image information as input to enhance the predictive capability of newly developed models. Furthermore, we analyze the resulting model’s behavior to characterize its main error sources, and find that it performs best in scenarios that do not involve newly formed coronal holes or coronal mass ejections. This behavior matches a priori expectations derived from the underlying physics. Finally, we briefly discuss the need for an in-depth study of different evaluation metrics to provide a more thorough understanding of modeling results in the field.
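For readers unfamiliar with the headline metric, the following is a minimal sketch of how an RMSE and a "mean ± spread" summary over repeated training runs could be computed. It is not taken from the SWAN code: the function names `rmse` and `summarize_runs` are hypothetical, the speed values are made up for illustration, and treating the ± figure as a sample standard deviation across runs is an assumption, since the text above does not state how the spread was obtained.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences (km/s)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def summarize_runs(observed, runs):
    """Mean and sample standard deviation of RMSE over several runs.

    Assumption: the spread is a sample standard deviation across
    independently trained models; at least two runs are required.
    """
    scores = [rmse(observed, predicted) for predicted in runs]
    return statistics.mean(scores), statistics.stdev(scores)


# Illustrative daily average bulk speeds in km/s (not real measurements).
observed = [400.0, 500.0]
runs = [[410.0, 490.0], [420.0, 480.0]]
mean_rmse, spread = summarize_runs(observed, runs)
print(f"RMSE = {mean_rmse:.2f} ± {spread:.2f} km/s")
```

Because RMSE squares each residual before averaging, a few large misses (for example, an unpredicted transient event) weigh more heavily than many small ones, which is one reason a single metric may give an incomplete picture of forecast quality.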