The number of snapshots sampled from an array signal directly affects the performance of direction-of-arrival (DOA) estimation methods, and a small number of snapshots often cannot represent all the features of the array signal. In practical applications, however, the target signal may change abruptly over short time scales, have low intensity, or be subject to strong noise interference, so the acoustic vector array sometimes cannot acquire sufficient signal data for accurate DOA estimation. This study therefore proposes a transfer-learning-based DOA estimation method for acoustic vector arrays. The method builds a pre-trained network model that combines a convolutional neural network (CNN) with long short-term memory (LSTM) to extract the spatial–temporal features of existing signal data, and then transfers the trained model to scenarios with limited snapshot data through fine-tuning, improving DOA estimation accuracy when only a small number of snapshots is available. Simulation experiments show that the proposed method outperforms traditional methods in both accuracy and root mean square error (RMSE) when only 1% of the target data are used. This indicates that the pre-trained CNN and LSTM model preserves the effective information in the signal data, and that transfer learning provides a new solution for real-time prediction with acoustic vector arrays in scenarios with a limited number of snapshots.
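
The two evaluation metrics named above, accuracy and RMSE, can be illustrated with a minimal sketch. The abstract does not state how accuracy is defined, so the version below assumes an estimate counts as correct when it falls within a tolerance of the true angle. The function names `doa_rmse` and `doa_accuracy`, the `tol_deg` parameter, and the sample angles are hypothetical and chosen only for illustration.

```python
import math


def doa_rmse(true_deg, est_deg):
    """Root mean square error between true and estimated angles, in degrees."""
    if len(true_deg) != len(est_deg) or not true_deg:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - t) ** 2 for t, e in zip(true_deg, est_deg)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def doa_accuracy(true_deg, est_deg, tol_deg=1.0):
    """Fraction of estimates within tol_deg of the true angle.

    The tolerance-based definition is an assumption, not the paper's.
    """
    if len(true_deg) != len(est_deg) or not true_deg:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(1 for t, e in zip(true_deg, est_deg) if abs(e - t) <= tol_deg)
    return hits / len(true_deg)


# Made-up angles for illustration only; these are not results from the study.
true_angles = [10.0, 20.0, 30.0, 40.0]
estimates = [10.0, 21.0, 30.0, 43.0]

print(f"RMSE: {doa_rmse(true_angles, estimates):.3f} degrees")
print(f"Accuracy: {doa_accuracy(true_angles, estimates):.2f}")
```

Reporting both metrics is useful because they capture different behavior: RMSE is dominated by large outliers, whereas a tolerance-based accuracy only counts how often an estimate lands close to the true direction.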