Large-scale longitudinal neuroimaging studies using diffusion imaging are needed to test and validate models of white matter neurophysiological processes that change over time, in both healthy and diseased brains. The predictive power of such longitudinal models will always be limited by the reproducibility of repeated measures acquired in different sessions. At present, there is limited quantitative knowledge about the across-session reproducibility of standard diffusion metrics in 3T multi-center studies of subjects in stable condition, in particular when using tract-based spatial statistics and in elderly people. In this study we implemented a multi-site brain diffusion protocol at 10 clinical 3T MRI sites distributed across 4 European countries (Italy, Germany, France and Greece) using vendor-provided sequences on Siemens (Allegra, Trio Tim, Verio, Skyra, Biograph mMR), Philips (Achieva) and GE (HDxt) scanners. We acquired DTI data (2×2×2 mm³, b = 700 s/mm², 5 b0 and 30 diffusion-weighted volumes) from a group of healthy, stable elderly subjects (5 subjects per site) in two separate sessions at least a week apart. For each subject and session, four scalar diffusion metrics were considered: fractional anisotropy (FA), mean diffusivity (MD), radial diffusivity (RD) and axial diffusivity (AD). The diffusion metrics from multiple subjects and sessions at each site were aligned to their common white matter skeleton using tract-based spatial statistics. Reproducibility at each MRI site was assessed using group averages of the absolute test–retest change relative to the mean (%) for three measures: i) the signal-to-noise ratio (SNR) of the b0 images in the centrum semiovale, ii) full-brain test–retest differences of the diffusion metric maps on the white matter skeleton, and iii) the diffusion metrics in atlas-based white matter ROIs on the white matter skeleton.
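The reproducibility measure described above (absolute change relative to the mean, in %) can be sketched as follows. This is a minimal illustration, not the study's analysis code: it assumes that "the mean" is the average of the two session values for each subject, and the function name and the FA values are hypothetical.

```python
import numpy as np

def reproducibility_error(test, retest):
    """Absolute test-retest change relative to the two-session mean, in percent.

    Assumption: the reference value is the per-subject mean of both sessions.
    """
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    session_mean = (test + retest) / 2.0
    return 100.0 * np.abs(test - retest) / session_mean

# Hypothetical mean FA values in one ROI for five subjects at one site.
fa_test = np.array([0.50, 0.48, 0.52, 0.45, 0.55])
fa_retest = np.array([0.51, 0.47, 0.50, 0.46, 0.55])

per_subject_error = reproducibility_error(fa_test, fa_retest)
site_error = per_subject_error.mean()  # group average for this site and ROI
print(f"site reproducibility error: {site_error:.2f}%")
```

Under this definition a value of 0% means identical test and retest measurements, and the error is symmetric in the two sessions.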
Despite differences in MRI scanner configuration across sites (vendors, models, RF coils and acquisition sequences), we found good and consistent test–retest reproducibility. White matter b0 SNR reproducibility was on average 7 ± 1%, with no significant MRI site effects. Whole-brain analysis showed no significant test–retest differences at any site for any of the DTI metrics. The atlas-based ROI analysis showed that the mean reproducibility errors, averaged across ROIs, largely remained in the 2–4% range for FA and AD and in the 2–6% range for MD and RD. These reproducibility values are comparable to those reported in studies using fewer MRI scanners, slightly different DTI protocols and mostly younger populations. We therefore show that the acquisition and analysis protocols used are appropriate for multi-site experimental scenarios.
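For reference, the four scalar metrics analysed here are standard functions of the three eigenvalues of the diffusion tensor. The sketch below uses the textbook definitions; the function name and the example eigenvalues are hypothetical, and the software actually used in the study to fit the tensor is not specified in this section.

```python
import numpy as np

def dti_scalars(eigenvalues):
    """Compute FA, MD, RD and AD from the three diffusion tensor eigenvalues."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # l1 >= l2 >= l3
    l1, l2, l3 = lam
    md = lam.mean()            # mean diffusivity: average of all eigenvalues
    ad = l1                    # axial diffusivity: largest eigenvalue
    rd = (l2 + l3) / 2.0       # radial diffusivity: mean of the two smallest
    norm = np.sqrt(np.sum(lam ** 2))
    # Fractional anisotropy: normalized dispersion of the eigenvalues (0 to 1).
    fa = np.sqrt(1.5) * np.sqrt(np.sum((lam - md) ** 2)) / norm if norm > 0 else 0.0
    return {"FA": float(fa), "MD": float(md), "RD": float(rd), "AD": float(ad)}

# Hypothetical eigenvalues in units of 1e-3 mm^2/s for a single voxel.
voxel = dti_scalars([1.7, 0.3, 0.3])
print(voxel)
```

FA is dimensionless, while MD, RD and AD carry the units of the eigenvalues, so percentage reproducibility errors can be compared across all four metrics.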