Person re-identification aims to match the same person across non-overlapping cameras. This paper focuses on unsupervised video-based person re-identification. The mainstream approach obtains pseudo-labels by clustering samples and then uses them to train a classification model. A risk of this scheme is that noisy pseudo-labels may damage the optimization of the model. To mitigate this risk, we propose a Successive Consensus Clustering framework that optimizes the pseudo-labels and the model iteratively. First, we apply consensus clustering over multiple frames of each video, which generates high-quality pseudo-labels for pedestrians. Second, we develop contrastive learning based on a cluster successive memory mechanism, which establishes correlations between the clustering results of different epochs and thereby stabilizes the training of the model. Experiments on three large-scale datasets show that our method outperforms the previous state-of-the-art method, surpassing it by 10.6% in rank-1 and 18.6% in mAP on MARS, and by 9.6% in rank-1 and 13.3% in mAP on DukeMTMC-VideoReID.
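To make the idea of consensus clustering over multiple frames concrete, the following is a minimal sketch and not the procedure used in this paper. It assumes each tracklet is represented by T per-frame feature vectors, links two tracklets within a frame when their features are closer than a hypothetical threshold `eps`, and keeps a link only if a fraction `agreement` of the frames supports it. The function names, the distance-threshold linkage, and both parameters are illustrative assumptions.

```python
import numpy as np


def frame_partition(features, eps):
    """Link tracklets whose features in a single frame lie within eps.

    Assumption: a plain Euclidean threshold stands in for whatever
    per-frame clustering algorithm is actually used.
    """
    diff = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    return dist <= eps  # boolean (N, N) linkage matrix for this frame


def consensus_labels(tracklets, eps=0.5, agreement=0.6):
    """Return pseudo-labels of shape (N,) for tracklets of shape (N, T, D)."""
    n, t, _ = tracklets.shape

    # Co-association: fraction of frames in which two tracklets are linked.
    co = np.zeros((n, n))
    for k in range(t):
        co += frame_partition(tracklets[:, k, :], eps)
    co /= t

    # Union-find over the links that enough frames agree on.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    rows, cols = np.nonzero(np.triu(co >= agreement, k=1))
    for i, j in zip(rows, cols):
        parent[find(int(i))] = find(int(j))

    # Relabel connected components as consecutive integers.
    roots = [find(i) for i in range(n)]
    _, labels = np.unique(roots, return_inverse=True)
    return labels
```

In this sketch, a single corrupted frame cannot merge two identities on its own, because a link must be supported by several frames before it enters the final partition. That is the intuition behind using multiple frames to reduce pseudo-label noise.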