The magnesium and strontium contents of fossil ostracod-shell calcite have been used extensively in the reconstruction of water composition and temperature in lakes and, to a lesser degree, in marginal-marine waters and in the deep ocean. Current instrumentation allows single shells of these calcite microfossils to be analysed, and most studies have taken advantage of this capability. However, there is now good evidence for large intra-sample variability in the trace-element content of ostracod shells, both in living populations and in fossil assemblages. This has implications for any attempt to distinguish the low-frequency (∼decadal or longer) trends in water temperature and composition that are often the goals of palaeoenvironmental studies from higher-frequency (seasonal or interannual) ‘noise’. In this study, the Mg and Sr contents of living and fossil populations of six species of non-marine, marginal-marine and shallow-marine ostracods from 11 sites are used to investigate sources of variability and to estimate critical sample sizes for stratigraphic studies. Results confirm that variability amongst individual shells is generally large and cannot be attributed to instrumental error or sample handling alone. In some instances, variability can be explained by fluctuations in water composition and/or temperature, although this has to be evaluated in each case. Some evidence points to non-environmental control on variability in shell composition, perhaps connected to genetically controlled differences in physiology that affect calcification, to early diagenetic alteration, or to the role of environmental factors other than temperature or water composition. However, further work is needed to verify this. In some stratigraphic studies, large numbers (>20) of individual shells would need to be analysed in order to detect low-frequency environmental changes with any degree of confidence, although fewer shells may often suffice.
The critical sample size depends on the variability at each study site, which should ideally be evaluated using pilot studies.
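The dependence of critical sample size on site variability can be illustrated with a minimal sketch. It is not the procedure used in this study: it assumes a standard two-sample normal approximation, and the function name `shells_per_level`, the coefficient of variation `cv`, the detectable change `relative_shift`, and the default significance level and power are all hypothetical choices made for illustration only.

```python
import math
from statistics import NormalDist


def shells_per_level(cv, relative_shift, alpha=0.05, power=0.8):
    """Approximate number of shells needed per stratigraphic level.

    Assumption (not from the study): shell Me/Ca values at each level are
    roughly normal with the same coefficient of variation `cv`, and two
    levels are compared with a two-sided test at significance `alpha`.
    `relative_shift` is the change in the mean to be detected, expressed
    as a fraction of the mean (e.g. 0.2 for a 20% change).
    """
    if cv <= 0 or relative_shift <= 0:
        raise ValueError("cv and relative_shift must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    n = 2 * ((z_alpha + z_beta) * cv / relative_shift) ** 2
    return math.ceil(n)


# Illustrative inputs only: a pilot study would supply the real variability.
for cv in (0.1, 0.2, 0.3):
    print(cv, shells_per_level(cv, relative_shift=0.2))
```

Under these assumptions the required number of shells scales with the square of the variability, so tripling the coefficient of variation raises the requirement roughly ninefold. This is why a pilot estimate of variability at each site matters more than any single rule-of-thumb sample size.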