Abstract

While cover crops and mixtures are increasingly used to provide ecosystem services in agroecosystems, fundamental questions remain about how cover crop performance and composition vary under different conditions, limiting optimal cover crop use. We conducted a field experiment at a research farm in New York and repeated a subset of treatments in three working farm fields. We selected two common cover crops, hairy vetch (Vicia villosa Roth), a legume, and common wheat (Triticum aestivum L.), a grass, with multiple cultivars of each. We examined the effect of cover crop composition, spanning intraspecific mixtures and grass–legume mixtures, on five ecosystem services: cover crop productivity, weed suppression, total biomass nitrogen, soil N retention, and long-term N supply via legume fixed N. We found no effect of intraspecific diversity on any ecosystem service we measured, nor was that response context dependent. We did observe significant ecosystem service improvements in the grass–legume mixture, although these were context dependent: the performance of the mixture relative to the monocultures varied among farm sites. Regardless of this interaction, however, the grass–legume mixture performed as well as or better than either monoculture for all services and sites, except soil N accrual at one site. Consequently, increasing complexity in cover crops through grass–legume mixtures is a low-risk practice that has the potential to deliver ecosystem service outcomes greater than those of monocultures across a range of growing conditions.