Abstract

A causality dilemma has hitherto existed in relation to low Bispectral Index (BIS) values and poor outcomes after general anesthesia in older patients. Does low BIS result in poor health or does poor health result in low BIS?1 Large observational studies have failed to answer this question because of their inherent inability to account for patients who are sensitive to anesthesia self-selecting into the deep anesthesia group,2–7 and a recent large randomized trial intending to recruit 970 patients was stopped after an interim analysis of 381 patients for futility (Table 1).8 Calls for evidence from randomized trials9,10 therefore remain unanswered.

Table 1: Previous Studies of Anesthetic Depth and Mortality

In this issue of the Journal, Brown et al.11 report on a follow-up survival analysis of 114 patients originally enrolled in a randomized trial to assess postoperative delirium after hip fracture repair.12 These elderly patients (mean ± SD age 81.7 ± 7.2 years) received spinal anesthesia supplemented by either light (mean BIS 85.7 ± 11.3) or deep (49.9 ± 13.5) sedation using propofol and/or midazolam.11,12 Overall, 1-year mortality was similar in the light and deep groups (19.3% vs 29.8%; P = 0.21). However, when only sicker patients were considered (Charlson comorbidity index >4), light sedation was associated with lower 1-year mortality than deep sedation (22.2% vs 43.6%; P = 0.04).13

In this study, 2 reasons for an association between sedation depth and survival were investigated.11 First, the effect of postoperative delirium was considered, as patients in the light group were less prone to this complication than patients in the deep group (19% vs 40%; P = 0.02). Delirium is more likely with deep sedation/anesthesia12,14 and potentially may be a marker of anesthetic toxicity (either directly15 or via electroencephalographic burst suppression16). However, no interaction between delirium and sedation depth in mediating mortality was demonstrated.
Second, the effect of arterial hypotension (defined as an intraoperative systolic blood pressure decrease >30% from preoperative values and/or systolic blood pressure <90 mm Hg) was explored, as hypotension was associated with increased mortality in recent studies.7 No significant difference in the duration of hypotension between the light and deep groups was demonstrated in all patients (9 ± 14 vs 13 ± 22 minutes; P = 0.28)12 or those with Charlson comorbidity indices >4 (median [interquartile range] 0 [0–15] vs 5 [0–12] minutes; P = 0.76).11

Appropriately, Brown et al.11 advocated further research rather than speculating further on etiology or recommending an immediate change in practice. Our editorial will expand on the need for caution in immediately extrapolating the results of this interesting study to clinical practice.

Our first note of caution relates to sample size. The original sample size calculation for this study was based on the assumption of postoperative delirium incidences of 12% and 36% in the light and deep groups, respectively (power = 86%; α = 0.05).12 For the follow-up survival analysis, 80% power was estimated to detect a hazard ratio of 0.58 for survival in lightly sedated compared with deeply sedated patients.11 Although these powers are commonly accepted in the medical literature, it is instructive to recall that such studies will miss 14% and 20% of “true” results, respectively.
The combination of small sample size and low power may not only increase the likelihood of failing to demonstrate a true effect, but also may increase the risk of showing a spurious effect, due to mathematical inevitabilities or the co-occurrence of biases.17 Even in the presence of excellent study design, small trials with low power can produce unreliable findings due to low prior probabilities of finding true effects, low positive predictive values for claimed effects, and exaggerated estimates of effect size for true effects,17 the so-called “winner’s curse.”18 In the anesthesia literature, this curse is illustrated by spectacular risk reductions for myocardial infarction in early β-blocker trials19,20 that were not replicated by large trials21 or meta-analyses.22 Ioannidis18 urges caution in interpretation of large effects from early small trials and encourages larger trials in the discovery phase as a means to avoid being misled.

A more intuitive way of looking at sample size is to look at the fragility of a study.23 Fragility is the number of patients who would need to have a different outcome to change the result and provides a useful measure of the robustness of a study. It is akin to the concept of reproducibility, which is the likelihood that an identical study would produce the same result if done again.24 In Brown et al.,11 in the subgroup of patients with a Charlson comorbidity index >4, 10 of 45 patients died within a year in the light group and 17 of 39 patients died within a year in the deep group. Just 1 more patient dying in the light group, or 1 fewer patient dying in the deep group, would lead to a nonsignificant result (χ² test, P = 0.07). Such fragile results can only be regarded as hypothesis generating, or requiring confirmation, as correctly identified by the authors.
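The fragility arithmetic above can be checked with a short calculation. The following is a minimal sketch, assuming a Pearson χ² test without continuity correction on the 2×2 table of deaths and survivors; the function names `chi2_p_value` and `flips_to_nonsignificance` are hypothetical, and the exact P values depend on which test variant is used, so they may differ slightly from those quoted in the text.

```python
import math


def chi2_p_value(deaths_a, n_a, deaths_b, n_b):
    """Pearson chi-square P value (1 df, no continuity correction) for two arms."""
    table = [
        [deaths_a, n_a - deaths_a],
        [deaths_b, n_b - deaths_b],
    ]
    total = n_a + n_b
    row_totals = [n_a, n_b]
    col_totals = [deaths_a + deaths_b, total - deaths_a - deaths_b]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def flips_to_nonsignificance(deaths_a, n_a, deaths_b, n_b, alpha=0.05):
    """Count survivors in arm a recoded as deaths until P is no longer below alpha.

    Arm a is assumed to be the arm with the lower event rate (here, light sedation).
    """
    flips = 0
    while deaths_a < n_a and chi2_p_value(deaths_a, n_a, deaths_b, n_b) < alpha:
        deaths_a += 1
        flips += 1
    return flips


# Subgroup with Charlson comorbidity index >4: 10/45 deaths (light) vs 17/39 (deep).
print(chi2_p_value(10, 45, 17, 39))              # observed table
print(chi2_p_value(11, 45, 17, 39))              # 1 more death in the light group
print(chi2_p_value(10, 45, 16, 39))              # 1 fewer death in the deep group
print(flips_to_nonsignificance(10, 45, 17, 39))  # number of flips needed
```

Under these assumptions the observed table gives P below 0.05, while either single-patient change pushes P above 0.05, so one changed outcome is enough to reverse the conclusion.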
According to traditional power calculations, for a 20% absolute difference in mortality as found in this study, with α = 0.05 and power = 0.8, a confirmatory study would need 197 patients. Such a study would require 7 more patients to die in the light group, or 9 fewer patients to die in the deep group, to change the significance of the result and thus would be more robust.

Some problems with study design become worse in small low-powered trials.17 Small sample size increases the risk of imbalance at baseline, as evidenced in Brown et al.11 by a higher proportion of patients living independently in the light group than the deep group (74% vs 56%; P = 0.08).12 This measured imbalance is a signal for potential imbalance in unmeasured but important prognostic variables that may be alternative explanations for the result. Small sample size also decreases the precision of risk estimates, as evidenced by wide 95% confidence intervals (approaching 1.0 at their upper end) around the hazard ratios for survival (0.28–1.33, 0.19–0.97, and 0.12–0.94 for all patients and those with Charlson comorbidity indices of >4 and >6, respectively). Finally, small sample size decreases the probability of demonstrating effects across all subgroup analyses (assumptions of proportionality for Cox hazard modeling were not supported for survival beyond 1 year). A small study may thus also fail to detect an important effect worthy of further investigation.

Our second note of caution relates to the generalizability of these results. When conducting a randomized trial, it is prudent to recruit patients who are at high risk of the primary outcome and ensure wide separation in the intensity of the intervention if both groups are to receive it. In previous studies, long-term survival has varied markedly (5.5%–24.3% mortality2–6; Table 1), but no study has included patients with the risk profile of these elderly hip fracture patients (long-term mortality 45%).
Furthermore, previous studies observed patients having general anesthesia (BIS <60), with or without neuraxial blockade.2–8 In the current study, all patients received spinal anesthesia, some in combination with general anesthesia (i.e., BIS near 50) and some without significant hypnotic administration (i.e., BIS around 85). Although a protective effect of neuraxial blockade is strongly supported,25 the combination of neuraxial blockade with general anesthesia has been associated with poorer outcomes than general anesthesia alone in a recent propensity score–adjusted post hoc analysis of Perioperative Ischemic Evaluation (POISE) study patients.26 These factors make generalization away from hip fracture patients under spinal anesthesia injudicious.

Finally, Brown et al.’s patients11 were randomized to dramatically different BIS values.12 In previous studies, depth of sedation was all in the general anesthesia range (i.e., BIS <60) and the differences in anesthetic depth among patients were smaller.2–8

We are currently recruiting to a 6500-patient international randomized controlled trial of volatile-based general anesthesia titrated to a BIS of 50 or 35 (Australian and New Zealand Clinical Trial Registry number 12162000632897). Eligible patients are aged ≥60 years, have significant comorbidities, and present for surgery lasting more than 2 hours. Our pilot study demonstrated the feasibility of BIS-guided titration and maintenance of similar arterial blood pressures, as well as 10% 1-year mortality in the index population.27 We hope that this large trial will definitively answer the question of whether low BIS values are truly associated with poor outcomes in elderly patients.

DISCLOSURES

Name: Kate Leslie, MBBS, MD, M Epi, FANZCA.
Contribution: This author helped prepare the manuscript.
Attestation: Kate Leslie approved the final manuscript.

Name: Timothy G. Short, MBChB, MD, FANZCA.
Contribution: This author helped prepare the manuscript.
Attestation: Timothy G. Short approved the final manuscript.

This manuscript was handled by: Sorin J. Brull, MD, FCARCSI (Hon).
