Abstract. Since 1999, Environment and Climate Change Canada (ECCC) has coordinated a multi-laboratory comparison of measurements of long-lived greenhouse gases in whole air samples collected at the Global Atmosphere Watch (GAW) Alert Observatory in the Canadian High Arctic (82°28′ N, 62°30′ W). In this paper, we evaluate the measurement agreement for atmospheric CO2, CH4, N2O, SF6, and the stable isotopes of CO2 (δ13C, δ18O) among leading laboratories from seven independent international institutions. Success is measured against the target goals for network compatibility outlined by the World Meteorological Organization's (WMO) GAW greenhouse gas measurement community. Overall, based on ∼ 8000 discrete flask samples, we find that the co-located atmospheric CO2 and CH4 measurement records from Alert by CSIRO, MPI-BGC, SIO, UHEI-IUP, and ECCC versus NOAA (the designated reference laboratory) are generally consistent with the WMO compatibility goals of ± 0.1 ppm CO2 and ± 2 ppb CH4 over the 17-year period (1999–2016), although there are periods when differences exceed the target levels and persist as systematic bias for months or years. Consistency with the WMO goals for N2O, SF6, and the stable isotopes of CO2 (δ13C, δ18O) has not been demonstrated. Additional analysis of co-located comparison measurements between CSIRO and SIO versus NOAA or INSTAAR (for the isotopes of CO2) at other geographical sites suggests that the findings at Alert for CO2, CH4, N2O, and δ13C–CO2 could be extended across the CSIRO, SIO, and NOAA observing networks. Our primary approach to estimating the overall level of measurement agreement was to pool the differences of all individual laboratories versus the designated reference laboratory and determine the 95th percentile range of the pooled data points.
Using this approach over the entire data record, our best estimate of the measurement agreement range is −0.51 to +0.53 ppm for CO2, −0.09 ‰ to +0.07 ‰ for δ13C, −0.50 ‰ to +0.58 ‰ for δ18O, −4.86 to +6.16 ppb for CH4, −0.75 to +1.20 ppb for N2O, and −0.14 to +0.09 ppt for SF6. A secondary approach, using the average of 2 standard deviations of the means for all flask samples taken in each individual sampling episode, provided similar results. These upper and lower limits represent our best estimate of the measurement agreement at the 95 % confidence level for these individual laboratories and provide greater confidence in using these datasets in various scientific applications (e.g., long-term trend analysis).
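The two agreement estimates described above can be illustrated with a minimal sketch. It is not the processing code used for this study. It assumes that the "95th percentile range" is the interval between the 2.5th and 97.5th percentiles of the pooled differences, and that the secondary estimate takes, for each sampling episode, twice the sample standard deviation of the per-laboratory means and then averages over episodes. The function names, the linear-interpolation percentile rule, and the laboratory labels and numbers in the usage note are hypothetical.

```python
from statistics import mean, stdev


def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in 0..100) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("no data")
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def pooled_agreement_range(diffs_by_lab, coverage=95.0):
    """Primary approach (as assumed here): pool every laboratory-minus-reference
    difference and return the central `coverage` percent interval."""
    pooled = sorted(d for diffs in diffs_by_lab.values() for d in diffs)
    tail = (100.0 - coverage) / 2.0
    return percentile(pooled, tail), percentile(pooled, 100.0 - tail)


def mean_two_sigma(episodes):
    """Secondary approach (as assumed here): for each sampling episode, take
    2 x the sample standard deviation of the per-laboratory means, then
    average over all episodes with at least two laboratories."""
    spreads = [2.0 * stdev(lab_means) for lab_means in episodes if len(lab_means) >= 2]
    if not spreads:
        raise ValueError("no episode with two or more laboratory means")
    return mean(spreads)
```

As a usage example with made-up numbers, `pooled_agreement_range({"LAB_A": [-0.2, 0.0, 0.1], "LAB_B": [0.3, -0.1]})` returns a (lower, upper) pair in the same units as the input differences, and `mean_two_sigma([[400.0, 400.2], [401.0, 401.0, 401.0]])` returns a single spread value. Because the first estimate is asymmetric by construction, it can reproduce limits such as those quoted above, whereas the second yields one symmetric half-width.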