Abstract
We study the properties of long gamma-ray burst (GRB) host galaxies using a statistical modelling framework originally developed to model damped Lyman-α absorbers (DLAs) in quasar spectra at high redshift. The distribution of $N_{\rm H\,I}$ for GRB-DLAs is ∼10 times higher than what is found for quasar-DLAs at similar impact parameters. We interpret this as a temporal selection effect: the short-lived GRB progenitor probes its host at the onset of a starburst, when the ISM may exhibit multiple over-dense regions. Owing to the larger $N_{\rm H\,I}$, the dust extinction is also larger, with 29 per cent of GRB-DLAs exhibiting A(V) > 1 mag, in agreement with the fraction of ‘dark bursts’. Despite the differences in $N_{\rm H\,I}$ distributions, we find that high-redshift (2 < z < 3) quasar- and GRB-DLAs trace the luminosity function of star-forming host galaxies in the same way. We propose that their differences may arise from the fact that the galaxies are sampled at different times in their star formation histories, and that the absorption sight lines probe the galaxy halos differently. Quasar-DLAs sample the full H I cross-section, whereas GRB-DLAs sample only regions hosting cold neutral medium. Previous studies have found that GRBs avoid high-metallicity galaxies (∼0.5 Z⊙). Since galaxies at these redshifts have lower metallicities on average, our sample is only weakly sensitive to such a threshold. Lastly, we find that the modest detection rate of cold gas (H$_2$ or C I) in GRB spectra can be explained mainly by a low volume filling factor of cold gas clouds and, to a lesser degree, by destruction from the GRB explosion itself.