Abstract

I suppose it is tempting, if you only have a hammer, to see every problem as a nail. Known as the law of the instrument, the metaphor does not cast a poor light on the hammer, but on our over-reliance on it. This concept is currently applicable to the field of programme evaluation in health care education. Currently, the four-level, outcomes-driven Kirkpatrick model is the dominant model used to evaluate health care education programming. Noble and rigorous pursuits to improve the use of this model include the paper in this issue by Schonrock-Adema and her colleagues.1 However, we have become over-reliant on this limited outcomes-driven model, which leads to the query: are we to continue to improve our use of Kirkpatrick (build better hammers) or do we expand our thinking, and thus our ‘toolbox’, to understand how complex programmes work to bring about both intended and unintended outcomes? I argue that we need to do both.

Allegiance to the Kirkpatrick model is understandable. Undeniably influenced by the works of Ralph Tyler, the culture of programme evaluation in 1959 was one that valued measurable, predetermined outcomes as the means of rendering judgement on the merit or worth of an educational programme. Formal public education evaluation efforts valued knowledge test scores to measure whether short-term learning outcomes were achieved. This practice of programme evaluation no doubt influenced the evaluation efforts of Kirkpatrick in the private sector. Dixon brought the model to the evaluation of health professions education, and in the last 35 years the model has become ingrained in the culture of evaluation of health care education programming. To date, the model has been used to evaluate hundreds of health care education programmes, and measurement at all four levels is still considered the reference standard in programme evaluation.2

Although not Kirkpatrick's original intent, the model's shortcomings lie in its conceptualisation as causal: that outcomes at each level can predict outcomes at the so-called ‘higher and more valuable’ levels. With this conceptualisation, the model is flawed. Recent work in the field of organisational development found that changes in Level 3 outcomes were better predicted by factors external to the training itself.3 Challenges with this model have been known for some time, so it is perhaps not surprising that efforts continue to improve it: arguments that Level 1 should measure motivation and engagement rather than reaction,4 that methods to measure Level 1 can be improved1 and that Level 3 should measure performance rather than behaviour.5 Furthermore, variations of the model are countless. The model has been revised to meet the evaluation needs of interprofessional education, simulation education and population health services programming, to name only a few. Despite efforts to improve or adapt this model, our attempts to measure outcomes at the highest levels are mixed at best.6 A recent conversation in the field of medical education can shed light on why this is.
It has been argued that medical education programming and the system in which it lives is complex, and that we need to consider alternative paradigms in medical education research that reflect that complexity.7 Considering this thought, it is therefore understandable that a model conceived to measure short-term, quantifiable economic outcomes is insufficient in its ability to enhance our understanding of the process by which complex programming works to bring about longer-term outcomes.8 This is an invitation for us to explore and use other models and frameworks that reflect an expanded view of programme evaluation and build a better ‘toolbox’; one that embraces the concept of complexity and considers other factors present in the system in which a programme lives.

The good news is that the field of programme evaluation learned this lesson many years ago. In response to the launch of the Sputnik satellite in the 1950s, the US Government engaged in system-wide curriculum reform and created legislation that mandated the evaluation of these new curricula.9 The failure of these outcomes-driven evaluation efforts to generate information that was useful for curriculum developers contributed to a watershed decade in the 1960s around new ways to think about and practise evaluation. Frameworks emerged that measured both a programme's processes and outcomes, examined how the programme worked to bring about observed outcomes, and helped stakeholders make decisions about their programmes by seamlessly interweaving evaluation and programme development.10

The concept of complexity in programme evaluation emerged in the 1990s. Michael Patton, an advocate of complexity thinking in programme evaluation, argues that if ‘causality is the relationship between mosquitos and mosquito bites’, then our questions in the field of programme evaluation need to shift from questions of attribution (‘can we attribute behaviour change to our programme?’ or ‘did we meet our intended outcomes?’) to questions of contribution (‘what role did our programme play in the outcomes that we are noticing?’). This work has already started in the field of health care education, but we are still at the beginning of this journey. Evaluation efforts have started to use models and frameworks that emphasise programme factors other than outcomes, such as context, process and the needs of stakeholders. Recent efforts have also started to explore the value of articulating a programme's theory, which helps to explain how a programme works.

This is not a plea to do away with the Kirkpatrick model, nor am I advocating that any toolbox should do away with the hammer. On the contrary, it is about building a better hammer and understanding when you need to use one in the first place. The Kirkpatrick model can be used effectively at the start of a programme's development to identify what programme outcomes are to be achieved and how the programme should be constructed and implemented to achieve those outcomes. However, to continue to rely solely on the Kirkpatrick model to render judgement on the merit or worth of our complex programming is doing a disservice to the value that these programmes actually have to their diverse stakeholders.
In the spirit of the quote ‘it's not either/or but both/and’, what is important to the evolution of programme evaluation is not only building a better hammer, but also building a better toolbox.
