With the development of location-based services and data collection devices, the volume of trajectory data has been growing at a phenomenal rate. Raw trajectory data come in the form of sequences of “coordinate-time-attribute” triplets, which require complicated manual processing before they can be used in data mining algorithms. Recent works have begun to explore emerging deep representation learning methods, which map trajectory sequences into a vector space and apply the resulting vectors to various downstream applications to boost accuracy and efficiency. In this work, we propose a universal trajectory representation learning method based on a Siamese geography-aware transformer (TRT for short). Specifically, we first propose a geography-aware encoder to model the geographical information of trajectory points. Then, we apply a transformer encoder to embed trajectory sequences and use a Siamese network to facilitate representation learning. Furthermore, a joint training strategy is designed for TRT. One training objective is to predict masked trajectory points, which makes the trajectory representation robust to low sampling rates and noise. The other is to distinguish between different trajectories by means of contrastive learning, which makes the trajectory representations more uniformly distributed over the hypersphere. Finally, we design a benchmark containing four typical traffic-related tasks to evaluate the performance of TRT. Comprehensive experiments demonstrate that TRT consistently outperforms state-of-the-art baselines across all tasks.
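To make the joint training strategy concrete, the sketch below shows, under our own assumptions, how a weight-sharing (Siamese) transformer encoder could be trained with the two objectives named above: masked trajectory-point prediction and an InfoNCE-style contrastive loss between two views of the same trajectory. All module names, dimensions, and hyperparameters are illustrative placeholders and do not reproduce the authors' implementation.

```python
# Hypothetical sketch of the joint training objective, not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TrajectoryEncoder(nn.Module):
    """Embeds a sequence of (already geography-encoded) point features."""

    def __init__(self, d_in=4, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(d_in, d_model)   # stands in for the geography-aware encoder
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.point_head = nn.Linear(d_model, d_in)   # reconstructs masked point features

    def forward(self, x):
        return self.transformer(self.input_proj(x))  # (batch, seq_len, d_model)

    def pool(self, h):
        return F.normalize(h.mean(dim=1), dim=-1)    # trajectory-level representation


def joint_loss(encoder, traj, mask, temperature=0.1):
    """Masked-point prediction + contrastive loss over a batch of trajectories.

    traj: (B, L, d_in) point features; mask: (B, L) bool, True where points are hidden.
    """
    corrupted = traj.masked_fill(mask.unsqueeze(-1), 0.0)      # drop masked points
    h_masked = encoder(corrupted)
    recon = encoder.point_head(h_masked)
    mask_loss = F.mse_loss(recon[mask], traj[mask])            # objective 1: recover masked points

    # Two views of the same trajectory pass through the shared encoder (Siamese setup).
    z1 = encoder.pool(h_masked)
    z2 = encoder.pool(encoder(traj))
    logits = z1 @ z2.t() / temperature                         # (B, B) similarity matrix
    targets = torch.arange(traj.size(0))
    contrast_loss = F.cross_entropy(logits, targets)           # objective 2: pull matching pairs together

    return mask_loss + contrast_loss


# Toy usage with random data.
enc = TrajectoryEncoder()
traj = torch.randn(8, 20, 4)
mask = torch.rand(8, 20) < 0.15
loss = joint_loss(enc, traj, mask)
loss.backward()
```

In this reading, the masking term encourages robustness to missing or noisy points, while the contrastive term spreads trajectory embeddings over the unit hypersphere, matching the two goals stated in the abstract.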