Because of unpredictable environmental variations and poor consistency in land data acquisition, complex near-surface seismograms are often heavily contaminated by noise and have a low signal-to-noise ratio (SNR). These conditions make it difficult to identify accurate first arrivals for the subsequent wave-equation traveltime inversion (WT). The autoencoder (AE) is a typical unsupervised learning network: an encoder compresses the input seismic data into their intrinsic features in the latent space, and a decoder then reconstructs the seismic profiles from these features as the output. This process is fully automatic, highly stable, and relatively insensitive to data quality. In this paper, we propose an elastic WT inversion algorithm based on the AE method (AEWT) to invert for the P-velocity model. In contrast to the standard WT, the AEWT method uses the AE to automatically extract the intrinsic features of the refractions, which serve as the reference data in the misfit functional. Feature images in the latent space show a sensitivity to velocity perturbations that is similar to, but stronger than, that of the traveltimes. We present one synthetic and two field data tests that compare the proposed AEWT and standard WT tomograms, with the aim of locating a buried fault and estimating the depth of a buried sinkhole. All these experiments demonstrate that the proposed elastic AEWT method can reduce errors caused by low SNR and obtain a more reliable and stable P-velocity tomogram.
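The encoder-decoder principle described above can be illustrated with a minimal sketch. It is not the network used in this work: the single tanh encoder layer, the linear decoder, the latent dimension, the learning rate, and the synthetic Gaussian-pulse "traces" are all assumptions chosen for brevity, and the function names (`make_traces`, `train_autoencoder`, `encode`, `decode`) are hypothetical.

```python
import numpy as np


def make_traces(n_traces=200, n_samples=64, noise=0.1, seed=0):
    """Synthetic stand-in for seismic traces (an assumption, not field data):
    one Gaussian pulse per trace, with a randomly varying arrival sample."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples)
    arrivals = rng.uniform(15, 50, size=n_traces)
    clean = np.exp(-0.5 * ((t[None, :] - arrivals[:, None]) / 3.0) ** 2)
    noisy = clean + noise * rng.standard_normal(clean.shape)
    return noisy, arrivals


def encode(x, w_enc):
    """Encoder: compress each trace into a few latent features."""
    return np.tanh(x @ w_enc)


def decode(z, w_dec):
    """Decoder: reconstruct each trace from its latent features."""
    return z @ w_dec


def train_autoencoder(x, latent_dim=4, lr=0.01, epochs=1000, seed=0):
    """Unsupervised training: minimize the reconstruction error by
    plain gradient descent. No labels (e.g. picked arrivals) are used."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w_enc = 0.1 * rng.standard_normal((d, latent_dim))
    w_dec = 0.1 * rng.standard_normal((latent_dim, d))
    losses = []
    for _ in range(epochs):
        z = encode(x, w_enc)
        err = decode(z, w_dec) - x
        # Loss: squared reconstruction error, averaged over traces.
        losses.append(float(np.sum(err ** 2) / n))
        # Backpropagate the loss through the decoder and the encoder.
        g_xhat = 2.0 * err / n
        g_wdec = z.T @ g_xhat
        g_pre = (g_xhat @ w_dec.T) * (1.0 - z ** 2)  # tanh derivative
        g_wenc = x.T @ g_pre
        w_enc -= lr * g_wenc
        w_dec -= lr * g_wdec
    return w_enc, w_dec, losses


if __name__ == "__main__":
    traces, _ = make_traces()
    w_enc, w_dec, losses = train_autoencoder(traces)
    features = encode(traces, w_enc)  # latent-space features per trace
    print(features.shape, losses[0], losses[-1])
```

In this toy setting the latent features play the role that the AE-extracted refraction features play in the misfit functional; how those features enter the actual AEWT objective is defined by the method itself and is not reproduced here.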