Abstract

Despite recent successes in the management of heart failure, many patients continue to have debilitating symptoms and prognosis remains poor. Indeed, effective treatment, by increasing longevity, has probably increased the number of patients with heart failure and persisting symptoms. New agents are required either to replace or to add to existing treatment. However, the development of new agents for heart failure is becoming increasingly complex precisely because partially effective treatment now exists. Therapeutic developments in heart failure have, in some ways, become a victim of their own success. Until the 1990s, no treatment for heart failure had been required to show an effect on mortality. The demonstration that many agents for heart failure could increase mortality and that some could decrease it led regulatory authorities to insist on enough information to exclude a substantial adverse effect on mortality. At the same time, the clinical community began to consider a reduction in mortality to be one of the expected benefits of effective treatment, a view that was encouraged by pharmaceutical companies with successful agents. Saving lives was emotive and justified the expense of the therapy in question and the staff and investigations required to deliver it. The rise in importance of mortality as an end-point for clinical trials has had several unfortunate consequences. Trials that have mortality as their primary end-point need to be large. Large trials are expensive, both financially and in terms of human resources. This inhibits the development of new interventions since the commercial risk is perceived to be high. Large trials often demand simple protocols that gather little information on symptoms, and the lack of importance placed on symptoms in these trials has probably had an effect on clinical practice. Consequently, the symptomatic needs of patients may have received less attention than they should from clinicians.
In turn, this has led to relative neglect of symptoms in trials of new agents. Although all-cause mortality will always remain important as a measure of the safety of an intervention, those involved in the development of new strategies for the management of heart failure would like to identify ways of assessing the effectiveness of therapy in smaller numbers of patients. This may increase the speed of new developments and reduce the risk to those patients, investigators, companies, charities and government agencies that make the investments. Two broad strategies have been used in an attempt to conduct meaningful studies in smaller numbers of patients. The first strategy was to try to develop surrogate measures for the efficacy of agents. Exercise testing, arrhythmia monitoring and measurement of ejection fraction or of ventricular remodelling have all been advocated 1. Some surrogate measures, such as exercise testing and arrhythmia monitoring, have been largely discredited. Other surrogate measures are not persuasive, either because they fail to account for the complexity of the pathologies underlying the progression of heart failure or because they are simply irrelevant to the patient's own experience. Patients don't complain about their left ventricular end-systolic volume! The second strategy is to look at measures other than mortality that are relevant to patients and their care. This has the potential advantage of increasing simultaneously the feasibility and clinical relevance of the study. The most obvious outcome measure is symptoms. Symptoms are highly and directly relevant to patients. Whereas a patient only dies once, they may suffer symptoms every day of their life. An effective treatment in properly selected patients might be able to show an improvement in symptoms in a few tens of patients. There are a number of problems with using symptoms alone as an outcome in heart failure.
Patients who die cannot be assessed symptomatically, although they could be judged to be in the worst possible state. Provided enough patients are studied to overcome any ‘noise’ introduced by death, this does not pose a major problem. Assessing symptoms over the short-term in stable patients with severe heart failure is fairly straightforward but is much more complex in any other scenario. Patients who do not have symptoms cannot improve; only prevention of worsening symptoms can be studied. This entails the study of larger numbers of patients over longer periods since deterioration in an individual patient is relatively unpredictable. Patients who have very severe symptoms are usually not stable. These patients often rapidly deteriorate further and die but many improve dramatically. Any attempt to introduce stability criteria in these patients is probably doomed to failure since patients are likely to stabilise by dying or by getting better. Studying the speed of improvement, its magnitude and the frequency of relapse is clinically relevant, feasible and logical. Although the natural history of heart failure is generally downhill, the rate of deterioration is very variable and the course usually fluctuating. Many patients will have severe symptoms relieved by the introduction of more effective therapy; diuretics and vasodilators may change symptom status dramatically within hours. It then becomes unclear whether a patient's symptoms have changed because of the concomitant therapy or the intervention being studied. One way of dealing with this is to regard increases in therapy as evidence of worsening. However, this is not entirely satisfactory because treatment might be increased because of factors other than symptoms, such as to improve control of ventricular rate or blood pressure. Also, worsening renal function and hypotension may be manifestations of worsening heart failure that demand a reduction in therapy. 
Some are reluctant to accept symptoms as objective evidence of benefit. However, their over-riding clinical relevance should quell such concerns. On the other hand, it is not clear how symptoms should be assessed. The most common measure of patients’ symptoms is NYHA functional class, which appears to be an effective predictor of outcome and is responsive to effective therapy 2. However, it is not clear how investigators have applied NYHA classification. NYHA class should simply be a patient's response to a question about their ability to exert themselves, but it is possible that many clinicians use NYHA class as a way of expressing how well they think the patient is doing. This may be its strength but could also be seen as a weakness. NYHA class describes how a patient feels at one moment. Others have opted to ask patients whether they feel that their symptoms have changed over time. Asking about a change in health state implies that the patient can recall accurately how they used to feel at a certain time in the past. A patient who used to feel well, who developed an acute exacerbation of heart failure and who now feels better but is clearly not as well as at baseline, may report that they have improved when in fact they have deteriorated overall. For this reason, it is probably better to assess symptoms primarily in absolute terms, although little extra effort is required to assess the patient's opinion about changes in symptoms in addition. Another problem caused by the fluctuating course of heart failure is the effect of chance when symptoms are assessed only at the beginning and end of therapy. If a patient develops an intercurrent illness unrelated to their cardiac problem, for instance an infection, just before the second assessment, they may report worsening symptoms even if they had felt much improved during most of the follow-up.
It seems strange that almost all trials report only the symptoms at the start and end of treatment, throwing away the vast majority of the patient's experience. There is of course no reason why symptoms should not be assessed serially and the average response reported. In this way, the effects of a chance event leading to a sudden temporary change in symptoms are limited and a much more accurate representation of the patient's overall status is given. Another major problem when assessing symptoms is the placebo response. Many patients respond promptly to a placebo. This probably reflects a variety of factors. Investigators may be keen to recruit patients and may report symptoms to be more severe than they actually are at baseline but report them accurately after randomisation (often termed NYHA creep). Other patients may have symptoms of fluctuating severity that were destined to improve anyway. Other patients benefit from the extra care and attention they get from participating in a clinical trial, while others have a genuine ‘mind over matter’ placebo-response. It is important to recognise and develop strategies to deal with the ‘placebo’ response in clinical trials. Simple approaches, such as a single-blind placebo run-in, will reduce some of the above problems but will not eliminate them. More intelligent solutions are required to isolate the effects of the treatment from the investigator- and patient-dependent placebo-response. Symptoms may be modified by adjusting therapy. It is not practically possible to prevent changes in therapy in patients who have side effects from their treatment or who deteriorate markedly. Increased therapy may result in the patients being symptomatically as well as or better than before the start of the study, although such events predict a poor outcome and are often attended by a pronounced deterioration in underlying cardiac function. Thus, changes in therapy for heart failure should be taken into account during follow-up.
However, patients with heart failure are commonly on complex therapeutic regimens and will often have adjustments in therapy for reasons other than symptomatic deterioration. In many respects, evaluation of changes in therapy is the most difficult aspect of the composite clinical outcome. Hospitalisation is a frequent, usually important, experience for the patient, is an integral part of the morbidity of the disease, predicts a high mortality and is a major contributor to health-care costs. For all these reasons, it is important to consider hospitalisation as part of an outcome measure in patients with heart failure. Most hospitalisations in patients with heart failure are for conditions other than worsening heart failure. However, heart failure contributes a much larger proportion of the days in hospital and the severity of heart failure may often determine the length of stay for non-heart failure admissions 3. Cause-specific events, whether mortality or hospitalisation, are not a robust measure of treatment effect 4. Decisions about whether heart failure is the cause of something or not are often arbitrary. Moreover, the side effects of treatment may cause an increase in non-heart failure hospitalisation that may offset any benefit from a reduction in heart failure hospitalisation. Indeed, if the patient is hospitalised for a non-heart failure event then they are not at risk of being hospitalised for heart failure. Cause-specific events may be used to explain the natural history of disease and how to modify it (e.g. if stroke contributed to 20% of days in hospital amongst patients with heart failure, a new treatment to prevent stroke could be worthwhile). One potential criticism of hospitalisation as an outcome in international studies is large country-to-country variation in the threshold for hospitalisation and the duration of hospital stay. 
As long as the randomisation process is robust at a national level, this does not matter as a measure of treatment efficacy although it could have a bearing on health economic evaluation at the national level. Any attempt to assess the benefits on symptoms or non-fatal events must consider mortality. Patients with more severe symptoms of heart failure are also more likely to die. Any treatment that reduces mortality is likely to have a greater effect amongst highly symptomatic patients and therefore, the benefit on symptoms will be underestimated because it keeps such patients alive. Conversely, if the agent (or placebo) is associated with an increase in mortality there may appear to be an improvement in symptoms, because the sickest patients have died selectively. It is also important to assess mortality because of the diversity of ways that a treatment can reduce non-fatal events 5. A treatment can genuinely reduce non-fatal events. Alternatively, a treatment may conceal events, which is not a major problem provided concealment does not have an adverse effect on subsequent morbidity and mortality. A third possibility is that treatment may increase mortality, especially sudden deaths, thereby preventing non-fatal events. This may happen, either because sudden death shortens the time-span in which non-fatal events can occur or because treatment converts a non-fatal event into a fatal one. It is important that all-cause mortality is considered because it is often difficult to attribute the cause of death accurately to one particular cause and because death precludes any further assessment of symptoms or non-fatal events 4. Composite outcome measures often including death, hospitalisation for heart failure and other measures of worsening heart failure have some merit. However, the traditional use of time to first event analyses or assigning an end-of-study outcome is questionable for any event other than mortality 6. 
A single adverse, non-fatal episode early in the course of the study could result in the patient being assigned to the group with an adverse outcome even if they improved dramatically later in the study. For treatments that might provoke early non-fatal adverse events, such as beta-blockers or cardiac resynchronisation therapy, but produce long-term benefits, the end-pointing method may be critically important. Days alive and out of hospital has become a popular end-point in recent clinical trials. It is probably a good measure of the benefits of an agent on morbidity and health economic outcomes. However, in studies with a substantial mortality it is likely that death rather than hospitalisation will be the major contributor to this outcome. This outcome is also statistically complex to handle due to the fact that outcome is often skewed because a substantial minority of patients have no events during follow-up. Patients want to have their symptoms improved, morbidity reduced and, usually, to live longer. They would generally prefer this to happen with the smallest amount of treatment and the fewest side effects. All are important to patients and a clinically relevant outcome measure should attempt to reflect their needs. Of course, the patient, and therefore, their doctor, may want to know whether the treatment only has an effect on symptoms or prognosis (although effective treatments will often do both) so that they can make an informed decision, should they wish, about their therapy. One way of assessing this is to produce a clinical scoring system including symptoms, change in therapy, hospitalisation and death that describes the patient's journey during the course of a clinical trial. Experience with diaries to record symptoms suggests that most patients will not make reliable daily entries for more than a few weeks. Also, when such information is collected the logistics of data entry can be enormous. 
Patients could be provided with an electronic counter, which could surmount such problems, but feasibility trials need to be conducted. An alternative pragmatic solution, which is increasingly used in clinical trials, is the serial measure of symptoms at practical intervals. This may be hourly for acute heart failure, monthly for the conventional 6-month study or longer intervals for studies of greater duration. Symptoms over the intervening period could be assumed to be the average of measurements at the beginning and end of each measurement period. Serial measurements describe the patient-journey, guard against the variability inherent in single before-and-after measurement and distinguish between periods of temporary and permanent deterioration. Patients’ symptoms can be given a score which may be arbitrary (e.g. asymptomatic = 100%; symptoms on minimal exertion = 50%; severely symptomatic at rest = 0%), derived from published data 7 or from new scoring systems that still require validation. Symptoms should be assessed primarily on an absolute scale rather than as a change from baseline, to avoid the problems of inaccurate patient/investigator recall. For practical purposes, changes in therapy should be recorded at the same time as symptoms. It is very difficult indeed to track every day-to-day change in a complex medical regimen. However, major permanent changes in treatment will not be missed by serial assessment. It may be wise to focus on major medications directed specifically at episodes of worsening such as diuretic dose, a powerful marker of prognosis 8, and intravenous inotropic and vasodilator therapy. Other treatments, including ACE inhibitors, beta-blockers, spironolactone and digoxin, may be given or adjusted for reasons other than clinical deterioration, including hypertension, atrial fibrillation or simply for greater prognostic effect. Collecting adequate pharmacological data repeatedly, even if focussed on a few key agents, can be difficult.
However, as long as treatment amongst survivors is known at the beginning and end of the study little will be lost. Changes in treatment can be added to the composite outcome score by adjusting the symptom score (see Table 1). Death can be given an arbitrary score of zero for the composite outcome and is an irreversible state. Hospitalisation could also, arbitrarily, be given a score of zero, but is of course a potentially reversible state. Duration of admission, even when not primarily for heart failure, often has more to do with the severity of heart failure than with the reason for hospitalisation and is therefore a more relevant outcome than whether or not the patient was admitted. Other events that may lead to prolonged admission, such as disabling stroke, are an important and common morbidity in patients with heart failure and should be included as a measure of the effect of therapy. Using the above techniques, an outcome of symptom-adjusted days alive and out of hospital can be obtained that reflects the patient-journey. If a quality of life score is measured then quality-adjusted days alive can also be measured (essentially a QALY, or quality-adjusted life year), potentially the most important health outcome measure. Initial experience with these composite outcome scores suggests that they are more or less normally distributed and therefore parametric statistics can be used for power calculations and to assess the effects of treatment. Parametric statistics are much more powerful than traditional methods such as Kaplan–Meier survival/event curves or simple rank assignment (better, no change or worse), methods which also have the pitfall of being dependent on only one event rather than the overall patient experience. It is important that outcome is measured over a finite period so that all patients have an equal period of exposure to the risk of events and can attain the same potential maximum score.
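The scoring rules described above can be illustrated with a minimal sketch. It assumes symptom scores on the arbitrary 0 to 100% scale, takes the score between two assessments to be the average of the two, and counts days in hospital and days after death as zero. The function name, the data layout, the carrying forward of the last recorded score and the patient trajectories in the usage note are all hypothetical choices made for illustration; they are not taken from Table 1, and the adjustment for changes in therapy is not modelled.

```python
from typing import List, Optional, Sequence, Tuple

Visit = Tuple[int, float]  # (study day, symptom score on a 0-100 scale)


def interval_score(visits: Sequence[Visit], day: int) -> float:
    """Score assumed for one day: the average of the two visits that bracket it."""
    for (d0, s0), (d1, s1) in zip(visits, visits[1:]):
        if d0 <= day < d1:
            return (s0 + s1) / 2.0
    # Assumption: outside the assessed range, carry the nearest score forward/back.
    return visits[-1][1] if day >= visits[-1][0] else visits[0][1]


def symptom_adjusted_days(
    visits: List[Visit],
    follow_up_days: int,
    hospital_stays: Sequence[Tuple[int, int]] = (),
    death_day: Optional[int] = None,
) -> float:
    """Symptom-adjusted days alive and out of hospital over a fixed follow-up.

    hospital_stays holds (admission day, discharge day) pairs; the discharge
    day itself counts as out of hospital. The maximum attainable value equals
    follow_up_days (asymptomatic, alive and out of hospital throughout).
    """
    visits = sorted(visits)
    in_hospital = set()
    for admitted, discharged in hospital_stays:
        in_hospital.update(range(admitted, discharged))

    total = 0.0
    for day in range(follow_up_days):
        if death_day is not None and day >= death_day:
            break  # death scores zero for every remaining day
        if day in in_hospital:
            continue  # a day in hospital scores zero but is reversible
        total += interval_score(visits, day) / 100.0
    return total
```

With 200 days of follow-up, a hypothetical patient whose score stays at 50% throughout accumulates `symptom_adjusted_days([(0, 50), (100, 50), (200, 50)], 200)`, which is 100.0 days, and an asymptomatic patient who dies suddenly on day 120 accumulates `symptom_adjusted_days([(0, 100), (100, 100)], 200, death_day=120)`, which is 120.0 days. Every patient is scored against the same `follow_up_days`, so all share the same potential maximum.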
However, the length of time chosen can be varied depending on the type of problem being treated and the goals of therapy. For patients with acute heart failure, it might be reasonable to assess outcome over 24 h or 10 days. For moderate to severe heart failure an outcome over 100–200 days should be enough to assess symptomatic response. The approach can be used to assess mild heart failure, when treatment is used to delay progression rather than improve well-being, if the evaluation is conducted over a long period (e.g. 500 days). One potential problem with this composite measure is that power calculations indicate that very small trials may be enough to show important differences in outcome. The issues of credibility and safety then arise. A study, although positive for the primary composite, may be considered too small to be clinically credible. This can be dealt with either by designing the study to detect or exclude very small differences (near equivalence), by increasing power beyond the conventional 80–90% or by additionally powering the study for some of the components of the global composite endpoint. Studies with such a composite outcome measure can be used as a proof of concept to eliminate rapidly those therapies that are unlikely to be helpful while identifying those that are most likely to succeed. However, since such studies are designed to investigate the effects of treatment in relatively small numbers of patients over relatively brief periods there will not be sufficient patient-exposure to assess the safety of therapy properly. Whenever safety is an important issue, as it usually is, large outcome studies directed predominantly at mortality will be required. 
However, should two studies show a clinically relevant benefit on the proposed composite outcome measure, with each component of the composite behaving in a clinically coherent fashion, and the mortality study show safety, then this is likely to be sufficient proof for clinicians that the treatment should be used. The proposed clinical composite measure is currently being assessed in a number of clinical trials. Some examples of individual patient outcomes in a study with 200 days of follow-up are shown in Table 1. Patient 1 is an example of a patient doing well apart from a hospitalisation for a chest infection complicated by a temporary exacerbation of heart failure; the patient was discharged on the same medication as on admission. Patient 2 is an example of a patient with progressive worsening of heart failure who subsequently died. Patient 3 is an example of a stable patient who died suddenly. Patient 4 is an example of a patient with progressive worsening of heart failure and recurrent hospitalisation who survived the trial. Patient 5 is an example of a patient with severe but stable symptoms who did not improve on therapy. Clinical trials should soon tell us more about how to refine this sort of measure, which should allow treatments for heart failure to be assessed more rapidly and reliably with fewer resources.
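The earlier point that parametric power calculations can point to very small trials may be easier to see with a rough sketch. It uses the standard normal-approximation sample size formula for comparing two group means and relies on the assumption, stated above, that the composite score is more or less normally distributed. The function name and the numbers in the usage note (a 20-day difference, a standard deviation of 40 days) are purely hypothetical and are not estimates from any trial.

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(delta: float, sd: float,
                       alpha: float = 0.05, power: float = 0.90) -> int:
    """Approximate patients per arm needed to detect a mean difference `delta`.

    delta: difference in mean composite score between arms (days)
    sd:    assumed common standard deviation of the composite score (days)
    alpha: two-sided significance level
    power: desired probability of detecting the difference
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)
```

Under these hypothetical inputs, `patients_per_group(20, 40)` gives 85 patients per arm at 90% power and `patients_per_group(20, 40, power=0.80)` gives 63. Halving the difference to be detected roughly quadruples the requirement, which is one way a study can be made larger by designing it to detect or exclude smaller differences.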
