Analyzing large volumes of vessel trajectory data can support more sophisticated traffic management and route planning, thereby reducing the risk of accidents. Deep learning methods are increasingly applied to vessel trajectory prediction; however, because the marine environment is complex, with geography, weather, and real-time traffic conditions all influencing vessel motion, predicting trajectories in less constrained maritime areas is more challenging than under path network conditions. Ship trajectory prediction methods based on kinematic formulas work well under ideal conditions but struggle with real-world complexity. Machine learning methods avoid kinematic formulas, but their simple structure prevents them from fully exploiting complex data. Deep learning methods require no preset formulas, yet they still struggle to achieve high-precision, long-term predictions, particularly for complex ship movements and heterogeneous data. This study introduces a transformer-based model for predicting vessel trajectories. First, we preprocess the raw AIS (Automatic Identification System) data to give the model a more efficient input format and data that are both more representative and more concise. Second, we combine convolutional layers with the transformer structure: convolutional neural networks extract local spatiotemporal features from the sequences, while the attention mechanism extracts their global spatiotemporal features. We retain the encoder-decoder structure of the traditional transformer. Finally, the model is trained and tested on publicly available AIS data.
Results on field data show that the model can predict trajectories containing both straight segments and turns in areas of complex terrain, and that it reduces the mean squared error by at least 6 × 10⁻⁴ compared with the baseline model.
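
The pipeline summarized above (local feature extraction by convolution, global feature extraction by attention, evaluation by mean squared error) can be illustrated with a toy sketch. This is not the model proposed in this study: the function names, the smoothing kernel, the identity query/key/value projections, and the four-point track are all assumptions chosen only to keep the example short and dependency-free.

```python
import math

def conv1d_local(seq, kernel):
    """Zero-padded 1D convolution over time, applied to each feature channel.

    seq: list of T feature vectors, each of length D.
    kernel: list of weights with odd length K (hypothetical, not learned here).
    """
    half = len(kernel) // 2
    T, D = len(seq), len(seq[0])
    out = []
    for t in range(T):
        row = []
        for d in range(D):
            acc = 0.0
            for k, w in enumerate(kernel):
                idx = t + k - half
                if 0 <= idx < T:  # positions outside the sequence count as zero
                    acc += w * seq[idx][d]
            row.append(acc)
        out.append(row)
    return out

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq.

    A real transformer uses learned projections; identity projections
    are an assumption made here for brevity.
    """
    D = len(seq[0])
    scale = math.sqrt(D)
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, seq)) for d in range(D)])
    return out

def mse(pred, target):
    """Mean squared error over all coordinates of two equal-shaped sequences."""
    n = sum(len(p) for p in pred)
    return sum((a - b) ** 2 for p, t in zip(pred, target) for a, b in zip(p, t)) / n

# Hypothetical normalized (longitude, latitude) points, not real AIS data.
track = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [0.3, 0.3]]
local_features = conv1d_local(track, [0.25, 0.5, 0.25])  # local context
global_features = self_attention(local_features)          # global context
```

In this sketch the convolution mixes each point only with its immediate neighbors, whereas the attention step lets every point weight every other point in the sequence, which is the local/global division of labor the abstract describes.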