Seismic full-waveform inversion (FWI) is a powerful imaging approach in exploration geophysics that estimates high-resolution subsurface geophysical properties by fitting synthetic data to observed seismic records. Nevertheless, it suffers from several intractable issues, e.g., solution non-uniqueness, cycle skipping, weak robustness and a high computational burden. Recently, increasing attention has been paid to data-driven FWI techniques based on deep neural networks. However, their success depends heavily on a large number of training instances, which are not readily available in oil exploration. To cope with this situation, an effective strategy is to reduce the dimensionality of the input and output spaces for the given training instances. Exploiting the nonlinear compressibility of 2D velocity profiles, we train a residual convolutional autoencoder in an unsupervised manner on velocity data alone; its encoder acts as a dimensionality reduction operator that transforms 2D velocity profiles into low-dimensional 1D pseudo-velocity curves. A residual network is then trained to learn a nonlinear mapping from multi-shot seismic records to these 1D pseudo-velocity curves. In the inference stage, the trained decoder converts the predicted 1D curves back into 2D velocity profiles. We thus propose a novel two-stage data-driven seismic inversion technique that combines a residual network with a residual convolutional autoencoder. Detailed experiments on four synthetic data sets (salt body, layered, faulted and salt dome models) show that the proposed method, with three different residual network architectures, outperforms other approaches, including FCNVMB, VelocityGAN and a pure residual network, according to four objective evaluation metrics and subjective visualization.
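
The data flow of the two-stage scheme can be illustrated with a minimal sketch. It is not the proposed implementation: PCA (a linear autoencoder) stands in for the residual convolutional autoencoder, ridge regression stands in for the residual network, and the data are random toy arrays rather than seismic simulations. All sizes, variable names and the helper functions `encode`, `decode` and `predict_curve` are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes (not taken from the paper).
n_train, H, W = 200, 16, 16        # 2D velocity profiles are H x W
n_shots, n_t, n_rec = 3, 20, 8     # multi-shot records: shots x time x receivers
latent_dim = 8                     # length of the 1D pseudo-velocity curve
n_seis = n_shots * n_t * n_rec

# Synthetic stand-in data: velocities lie near a low-dimensional subspace and
# the "seismic" records are a fixed random linear function of them plus noise.
basis = rng.normal(size=(latent_dim, H * W))
codes = rng.normal(size=(n_train, latent_dim))
velocity = codes @ basis + 0.01 * rng.normal(size=(n_train, H * W))
forward = rng.normal(size=(H * W, n_seis)) / np.sqrt(H * W)
seismic = velocity @ forward + 0.01 * rng.normal(size=(n_train, n_seis))

# Stage 1 (unsupervised, velocity data only): fit the dimensionality reduction.
# PCA is used here as a linear stand-in for the convolutional autoencoder.
mean_v = velocity.mean(axis=0)
_, _, vt = np.linalg.svd(velocity - mean_v, full_matrices=False)
components = vt[:latent_dim]       # shape (latent_dim, H * W)

def encode(v):
    """Flattened 2D velocity profiles -> 1D pseudo-velocity curves."""
    return (v - mean_v) @ components.T

def decode(z):
    """1D pseudo-velocity curves -> flattened 2D velocity profiles."""
    return z @ components + mean_v

# Stage 2 (supervised): map seismic records to the encoded curves.
# Ridge regression is used here as a stand-in for the residual network.
targets = encode(velocity)
mean_s = seismic.mean(axis=0)
X = seismic - mean_s
lam = 1e-3                         # ridge penalty, arbitrary toy value
weights = np.linalg.solve(X.T @ X + lam * np.eye(n_seis), X.T @ targets)

def predict_curve(s):
    """Flattened seismic records -> predicted 1D pseudo-velocity curves."""
    return (s - mean_s) @ weights

# Inference: seismic -> predicted 1D curve -> decoder -> 2D velocity profile.
test_velocity = rng.normal(size=(1, latent_dim)) @ basis
test_seismic = test_velocity @ forward
pred_curve = predict_curve(test_seismic)
pred_velocity = decode(pred_curve).reshape(H, W)

rel_err = (np.linalg.norm(pred_velocity - test_velocity.reshape(H, W))
           / np.linalg.norm(test_velocity))
print("pseudo-velocity curve shape:", pred_curve.shape)
print("reconstructed profile shape:", pred_velocity.shape)
print("relative reconstruction error:", rel_err)
```

The point of the sketch is the division of labour: the regression in the second stage only has to predict `latent_dim` values per sample instead of `H * W`, and the decoder learned in the first stage restores the full 2D profile at inference time.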