Representation learning for dynamic networks is designed to learn the low-dimensional embeddings of nodes that can well preserve the snapshot structure, properties and temporal evolution of dynamic networks. However, current dynamic network representation learning methods tend to focus on estimating or generating observed snapshot structures, paying excessive attention to network details, and disregarding distinctions between snapshots with larger time intervals, resulting in less robustness for sparse or noisy networks. To alleviate these challenges, this paper proposes a contrastive mechanism for temporal representation learning on dynamic networks, inspired by the success of contrastive learning in visual and static network representation learning. This paper proposes a novel Dynamic Network Contrastive representation Learning (DNCL) model. Specifically, contrast objective functions are constructed using intra-snapshot and inter-snapshot contrasts to capture the network topology, node feature information, and network evolution information, respectively. Rather than estimating or generating ground-truth network features, the proposed approach maximizes mutual information between nodes from different time steps and views generated. The experimental results of link prediction, node classification, and clustering on several real-world and synthetic networks demonstrate the superiority of DNCL over state-of-the-art methods, indicating the effectiveness of the proposed approach for dynamic network representation learning.