Learning-based methods have become popular in co-salient object detection (CoSOD). However, existing methods suffer from two challenging issues: mining the inter-image co-attention and calibrating the intra-image salient objects. Moreover, the available training data are insufficient. To address these challenges, we propose an end-to-end network using Two-stage Co-attention mining and Individual Calibration (TCIC) to predict the co-salient objects. Firstly, a two-stage co-attention mining architecture (TCM), consisting of a classified co-attention module (CCM) and a focal co-attention module (FCM), is designed to model inter-image relationships. In the first stage, the CCM captures the classification interactions among multiple images to tentatively extract the co-attention. In the second stage, the FCM adaptively suppresses and aggregates multiple salient features to recalibrate the co-attention obtained in the first stage. Secondly, considering the shape and location information offered by boundary features, an edge guidance module (EGM) is embedded into the individual calibration architecture (ICA) to calibrate individual images. In addition, a co-attention transfer strategy (CTS) is adopted to maintain the consistency of the co-attention during feature transfer in the decoder. Finally, TCM and ICA are integrated into a unified end-to-end framework that predicts fine-grained, boundary-preserving results. Furthermore, an image fusion algorithm (IFA) is tailored to automatically generate composite images without extra pixel-level annotations, supplementing the training dataset. Experimental results on three prevailing benchmark datasets demonstrate the superiority of the proposed method in terms of various evaluation metrics.
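The abstract does not detail how the IFA builds composite images; as a rough illustration only (not the paper's exact algorithm), the sketch below shows one way such a group of composites could be produced without extra pixel-level labeling: a salient object is alpha-blended onto different backgrounds using the saliency mask already available in a standard SOD dataset, so every generated image shares the same co-salient object. The function names `composite` and `make_group` are hypothetical.

```python
# Illustrative sketch of composite-image generation for CoSOD training data.
# Assumption: each foreground image already has a binary saliency mask from a
# standard SOD dataset, so no additional pixel-level annotation is needed.
import numpy as np

def composite(fg_img: np.ndarray, fg_mask: np.ndarray, bg_img: np.ndarray):
    """Paste the salient object of fg_img (given by fg_mask) onto bg_img.

    fg_img, bg_img: H x W x 3 uint8 arrays of the same size.
    fg_mask:        H x W float array in [0, 1] (existing SOD annotation).
    Returns the composite image and its pixel-level co-saliency mask.
    """
    alpha = fg_mask[..., None]                       # broadcast mask over channels
    fused = alpha * fg_img + (1.0 - alpha) * bg_img  # alpha-blend object onto background
    return fused.astype(np.uint8), (fg_mask > 0.5).astype(np.uint8)

def make_group(fg_img, fg_mask, bg_imgs):
    """Build an image group in which every image contains the same (co-salient) object."""
    return [composite(fg_img, fg_mask, bg) for bg in bg_imgs]
```

Under these assumptions, pairing each group with the reused foreground mask yields extra supervised examples that share a common salient object, which is the kind of supplementary training data the abstract attributes to the IFA.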