Wind turbine wakes are the most significant factor affecting wind farm performance, decreasing energy production and increasing fatigue loads in downstream turbines. Wind farm turbine layouts are designed to minimize wake interactions using a suite of predictive models, including analytical wake models and computational fluid dynamics (CFD) simulations. CFD simulations of wind farms are time-consuming and computationally expensive, which hinders their use in optimization studies that require hundreds of simulations to converge to an optimal turbine layout. In this work, we propose DeepWFLO, a deep convolutional hierarchical encoder–decoder neural network architecture, as an image-to-image surrogate model for predicting the wind velocity field in Wind Farm Layout Optimization (WFLO). We generate a dataset composed of image representations of the turbine layout and the undisturbed flow field in the wind farm, together with images of the corresponding wind velocity field including wake effects, generated with both analytical models and CFD simulations. The proposed DeepWFLO architecture is then trained and optimized through supervised learning with an application-tailored loss function that accounts for prediction errors in both wind velocity and energy production. Results on a commonly used test case show median velocity errors of 1.0% and 8.0% for DeepWFLO networks trained with analytical and CFD data, respectively. We also propose a model-fusion strategy that uses analytical wake models to generate an additional input channel for the network, resulting in median velocity errors below 1.8%. Spearman rank correlations between predictions and data, which indicate the suitability of DeepWFLO for optimization purposes, range between 92.3% and 99.9%.
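The role of the Spearman rank correlation as an optimization-suitability metric can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `average_ranks` and `spearman` and the power values are hypothetical, chosen only to show that a surrogate can carry absolute errors and still rank candidate layouts almost as the reference model does, which is what a layout optimizer relies on.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors.

    Assumes len(x) == len(y) and that neither input is entirely tied
    (otherwise the denominator is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (std_x * std_y)


# Hypothetical farm power (MW) for five candidate layouts; illustrative only.
reference_power = [41.2, 39.8, 43.5, 40.1, 42.0]  # e.g. from a wake model
surrogate_power = [40.5, 39.9, 42.8, 39.0, 41.7]  # e.g. from a surrogate

rho = spearman(reference_power, surrogate_power)
print(f"Spearman rank correlation: {rho:.3f}")
```

In this toy case the surrogate swaps the order of the two worst layouts but still identifies the same best layout. A correlation close to 1 therefore suggests that an optimizer driven by the surrogate would move toward the same layouts as one driven by the reference model.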