Multi-camera video surveillance environment has a variety of emerging research problems among, which person re-identification is the premier one. Unsupervised person re-identification has been explored less in literature than the supervised approach. Images acquired from the video surveillance systems are unlabeled, which denotes that it is naturally an unsupervised learning problem. The state-of-the-art unsupervised methods seek external annotations support such as incorporating transfer learning techniques, partial labeling of train images, etc., which makes them not purely unsupervised and unsuitable for practical real-world surveillance settings. Identity mismatch happens due to the similar costumes and complex environmental factors. To resolve this issue, we introduce a new framework named Spatio-Temporal Association Rule based Deep Annotation-free Clustering (STAR-DAC) which incrementally clusters the unlabeled person re-identification images based on visual features and performs cluster fine-tuning through the mined spatio-temporal association rules. STAR formulations leveraged upto 75% of images for reliable sample selection through cluster fine-tuning. STAR based fine-tune algorithm aims to attain ground-truth labels of an unlabeled dataset and eliminate cluster outliers to stabilize the evaluation. Experiments are performed on image and video-based benchmark person re-identification datasets such as DukeMTMC re-ID, Market1501, MSMT17, CUHK03, GRID and Dukevideo re-ID, iLIDSVid, ViPer respectively. Experimental results clearly show that the proposed STAR-DAC framework outperforms the state-of-the-art methods in case of large scale datasets with multiple cameras.