The results usually reported in corpus-linguistic studies are quantitative: frequencies, percentages, model parameters, etc. However, given that no two corpora are alike, and that sometimes different results are reported for very similar corpora (or even the same corpus), three central issues are: (i) how to identify and quantify the degree of variation associated with one's results; (ii) how to investigate the source of the observed variation in corpora; and (iii) how homogeneous one's corpus is with respect to a particular phenomenon.

In this paper, I shall present a methodology that addresses these issues, providing data from ICE-GB on the frequency of the English present perfect, the alternation of transitive phrasal verbs and the semantics of the English ditransitive. Specifically, I will show how applying resampling methods and exploratory data analysis to corpus data allows for (i) providing interval estimates for one's findings that show how superficially different results may reflect similar underlying tendencies; (ii) determining communicative dimensions underlying variation in a bottom-up fashion (similar to work by Biber, but based on just the phenomenon one is interested in); and (iii) quantifying the homogeneity of the corpus with respect to the phenomena one is actually interested in (rather than by the standard approach of using word frequencies).

For every parameter we estimate from data, we need to establish an unreliability estimate. We use this to judge the uncertainty associated with any inferences we may want to make about our point estimate, and to establish a confidence interval for the true value of the parameter. Up to now, we have used parametric measures like standard errors that are based on the assumption of normality of errors […]. If the assumption of normality is wrong, then our unreliability estimates will also be wrong, but it is hard to know how wrong they will be, using standard analytical methods.
An alternative way of establishing unreliability estimates is to resample our data […] (Crawley, 2002: 195; emphasis as original)
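
To make the idea of a resampling-based interval estimate concrete, the following is a minimal sketch of a percentile bootstrap over corpus files. It is an illustration under stated assumptions, not the procedure used in this paper: the function name `bootstrap_ci`, the choice of the corpus file as the resampling unit, the percentile method, and the counts and file sizes are all hypothetical and are not taken from ICE-GB.

```python
import random


def bootstrap_ci(counts, sizes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for a pooled frequency per 1,000 words.

    counts[i] is the number of hits in corpus file i, sizes[i] its word count.
    Files (not individual hits) are resampled with replacement.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(counts)
    estimates = []
    for _ in range(n_resamples):
        # Draw n files with replacement and recompute the pooled frequency.
        idx = [rng.randrange(n) for _ in range(n)]
        hits = sum(counts[i] for i in idx)
        words = sum(sizes[i] for i in idx)
        estimates.append(1000 * hits / words)
    estimates.sort()
    # Cut off alpha/2 of the resampled estimates in each tail.
    lower = estimates[int(round((alpha / 2) * n_resamples))]
    upper = estimates[int(round((1 - alpha / 2) * n_resamples)) - 1]
    return lower, upper


# Hypothetical per-file hit counts and file sizes (in words), invented for
# illustration only.
counts = [12, 7, 15, 9, 11, 4, 13, 8]
sizes = [2000, 2100, 1950, 2050, 2000, 1900, 2200, 2000]

point = 1000 * sum(counts) / sum(sizes)
low, high = bootstrap_ci(counts, sizes)
print(f"point estimate: {point:.2f} per 1,000 words")
print(f"95% percentile interval: [{low:.2f}, {high:.2f}]")
```

Resampling whole files rather than individual hits is a design choice in this sketch: it treats the file as the unit of variation, so that the width of the interval reflects how much the frequency differs from one file to the next.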