Large-scale, whole-systems interventions in health care require imaginative approaches to evaluation that go beyond assessing progress against predefined goals and milestones. This project evaluated a major change effort in inner London, funded by a charitable donation of approximately $21 million, which spanned four large health care organizations, covered three services (stroke, kidney, and sexual health), and sought to "modernize" these services with a view to making health care more efficient, effective, and patient centered. This organizational case study draws on the principles of realist evaluation, a largely qualitative approach that is centrally concerned with testing and refining program theories by exploring the complex and dynamic interaction among context, mechanism, and outcome. This approach used multiple data sources and methods in a pragmatic and reflexive manner to build a picture of the case and follow its fortunes over the three-year study period. The methods included ethnographic observation, semistructured interviews, and scrutiny of documents and other contemporaneous materials. As well as providing ongoing formative feedback to the change teams in specific areas of activity, we undertook a more abstract, interpretive analysis, which explored the context-mechanism-outcome relationship using the guiding question "what works, for whom, under what circumstances?" In this example of large-scale service transformation, numerous projects and subprojects emerged, fed into one another, and evolved over time. Six broad mechanisms appeared to be driving the efforts of change agents: integrating services across providers, finding and using evidence, involving service users in the modernization effort, supporting self-care, developing the workforce, and extending the range of services. Within each of these mechanisms, different teams chose widely differing approaches and met with differing success. The realist analysis of the fortunes of different subprojects identified aspects of context and mechanism that accounted for observed outcomes (both intended and unintended). This study was one of the first applications of realist evaluation to a large-scale change effort in health care. Even when an ambitious change program shifts from its original goals and meets unforeseen challenges (indeed, precisely because the program morphs and adapts over time), realist evaluation can draw useful lessons about how particular preconditions make particular outcomes more likely, even though it cannot produce predictive guidance or a simple recipe for success. Noting recent calls by others for the greater use of realist evaluation in health care, this article considers some of the challenges and limitations of this method in the light of this experience and suggests that its use will require some fundamental changes in the worldview of some health services researchers.