Monitoring the activities of construction equipment is important for evaluating productivity, which has motivated the development of many vision-based automated monitoring methods. State-of-the-art construction equipment activity recognition methods rely on supervised learning, which requires large labeled datasets for each equipment type and activity. Recently, many self-supervised deep learning methods have been proposed that exploit abundant unlabeled data to reduce annotation cost by deriving labels from the input data itself. However, these methods assume that abundant unlabeled data are available, which limits their applicability to construction equipment activity recognition. To address these problems, we propose CVRLoLD (Contrastive Video Representation Learning on Limited Dataset), a self-supervised contrastive learning approach that learns to recognize construction equipment activities from a limited dataset in which only a portion of the data is annotated. The objectives of this work are: (1) to propose a novel self-supervised method for excavator activity recognition, and (2) to improve the applicability of self-supervised learning to the relatively small datasets available for construction equipment activity recognition. The proposed method first trains a backbone network using contrastive learning on the unlabeled data, and then fine-tunes the pretrained backbone on the labeled data. The proposed method achieved an activity recognition accuracy of 81.7% while using only 30% of the labels in the dataset. The results demonstrate the potential of the proposed method for reducing the time and effort required for data labeling while achieving high performance on the relatively limited datasets available in the construction domain.
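
As an illustration of the contrastive pretraining stage, the following minimal sketch computes a normalized temperature-scaled cross-entropy (NT-Xent) loss over paired clip embeddings. The abstract does not specify which contrastive objective CVRLoLD uses, so the choice of NT-Xent, the function name `nt_xent_loss`, the temperature value, and the embedding shapes are all assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np

def nt_xent_loss(z_a, z_b, temperature=0.1):
    """Hypothetical NT-Xent loss for N paired embeddings of shape (N, D).

    z_a[i] and z_b[i] are assumed to be backbone embeddings of two
    augmented clips drawn from the same unlabeled video (a positive pair).
    All other embeddings in the batch act as negatives.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)            # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize rows
    sim = (z @ z.T) / temperature                     # scaled cosine similarities
    np.fill_diagonal(sim, -np.inf)                    # never contrast a clip with itself

    # Index of the positive partner for each of the 2N rows.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    sim_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - sim_max - np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))

    # Average negative log-probability assigned to the positive partner.
    return -log_prob[np.arange(2 * n), pos].mean()

# Toy check with random stand-ins for backbone outputs (not real data).
rng = np.random.default_rng(0)
a = rng.normal(size=(8, 16))
aligned = nt_xent_loss(a, a + 0.01 * rng.normal(size=a.shape))
unrelated = nt_xent_loss(a, rng.normal(size=a.shape))
```

In this sketch the loss is lower when the two views of each clip map to nearby embeddings than when they are unrelated, which is the signal that pretraining would exploit before the labeled subset is used for fine-tuning.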