Data-driven artificial neural networks (ANNs) offer several advantages over conventional deterministic methods across a wide range of geophysical problems. For seismic velocity model building, judiciously trained ANNs can estimate subsurface velocity models; however, training such networks effectively and efficiently remains a substantial challenge. Motivated by the multi-scale approach commonly used to address the non-linearity of full-waveform inversion (FWI), we develop a frequency-stepping velocity model building approach that uses a sequence-to-sequence recurrent neural network (RNN) with built-in long short-term memory (LSTM). The input sequences to the LSTM-RNN consist of frequency-domain seismic data ordered from the lowest available frequency to the highest usable or chosen frequency, while the corresponding output sequences are frequency-dependent smoothed velocity models. We compare the models recovered by the trained RNN with the true models both qualitatively and quantitatively. The normalized root-mean-square (NRMS) misfit between the true and predicted models has a mean of 6%, indicating that the network recovers the models accurately.
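As a minimal sketch of the quantitative comparison, the NRMS misfit between a true and a predicted velocity model could be computed as follows. The abstract does not state the exact normalization, so the definition used here, 100 × RMS(predicted − true) / RMS(true), is an assumption; the function name `nrms_percent` and the sample velocities are hypothetical and purely illustrative.

```python
import math


def nrms_percent(true_model, predicted_model):
    """Return the NRMS misfit in percent.

    Assumed definition (not taken from the source):
        100 * RMS(predicted - true) / RMS(true)
    Both inputs are flat sequences of velocities on the same grid.
    """
    if len(true_model) != len(predicted_model) or len(true_model) == 0:
        raise ValueError("models must be non-empty and of equal length")
    n = len(true_model)
    # RMS of the point-wise difference between the two models
    rms_diff = math.sqrt(
        sum((p - t) ** 2 for t, p in zip(true_model, predicted_model)) / n
    )
    # RMS of the true model, used as the normalization factor
    rms_true = math.sqrt(sum(t ** 2 for t in true_model) / n)
    if rms_true == 0.0:
        raise ValueError("true model has zero RMS; misfit is undefined")
    return 100.0 * rms_diff / rms_true


# Hypothetical flattened velocity models in m/s (illustrative values only)
true_v = [1500.0, 2000.0, 2500.0, 3000.0]
pred_v = [1550.0, 1950.0, 2600.0, 2900.0]
misfit = nrms_percent(true_v, pred_v)
print(f"NRMS misfit: {misfit:.2f}%")
```

Under this assumed definition, a mean misfit over a test set would be obtained by averaging `nrms_percent` across all true and predicted model pairs. Other normalizations, such as dividing by the mean of the two RMS values, are also in common use and would give different numbers.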