Abstract

Shortness of breath is a common presenting complaint in the emergency department (ED) with a wide differential diagnosis that includes acute heart failure (AHF), exacerbation of chronic obstructive pulmonary disease (COPD), pneumonia, and pulmonary embolism. The findings for these etiologies of dyspnea overlap, particularly in aging adults with significant comorbidities.1 Delays in the diagnosis and treatment of AHF worsen prognosis and increase healthcare costs.2-4 Emergency physicians play a key role in diagnosing AHF, assessing symptom severity, choosing initial management strategies, and determining disposition from the ED.5 Understanding the benefits and pitfalls of using history, physical examination, routine labs, x-ray imaging, and bedside sonography is essential. In this issue, Martindale et al. report a detailed systematic review of studies evaluating the accuracy of findings and tests that emergency physicians rely on to diagnose AHF in ED patients with dyspnea. They reviewed the literature on 1) history; 2) physical examination; 3) electrocardiogram (ECG); 4) ED echocardiogram; 5) chest x-ray; 6) lung ultrasound; 7) bioimpedance; and 8) natriuretic peptides, B-type natriuretic peptide (BNP) and N-terminal pro-BNP (NT-pro-BNP). We will comment on the results of their review with special focus on the use of BNP to diagnose AHF. The discussion by Martindale et al. of BNP is particularly noteworthy because they pooled patient-level BNP data from six studies (2,202 patients) and reported interval likelihood ratios. When the study population is defined by the patients' clinical presentation, it is difficult to evaluate symptoms and other elements of the patient's history as if they were laboratory or imaging tests. As an example, consider trying to evaluate the symptom of “dyspnea at rest” as a diagnostic test for AHF in studies of patients with dyspnea. 
In their Table 1, the authors report a pooled sensitivity for AHF of 54.6% and a pooled specificity of 49.6%. This means that dyspnea at rest occurred in 54.6% of patients with AHF and 50.4% (100% – 49.6%) of patients without AHF. From this we can conclude that about half of all patients with dyspnea say they have it at rest. Whether the dyspnea occurs at rest or only with exertion has little or nothing to do with its underlying cause. Many studies have reported poor reliability of physical examination findings, including findings on the respiratory examination.6 Rales on auscultation of the lungs were present in only 62.3% of patients with AHF and in 31.9% of patients without AHF. When interobserver agreement about the presence of a finding is poor, as it almost certainly is with rales on auscultation7, test accuracy is low. On the other hand, while individual examination findings may be unhelpful in diagnosing AHF, those first 2 minutes in the examination room are critical to the evaluation of the patient. In general, we are skeptical of intuitive decision-making, but the experienced clinician is often able to distinguish the severely from the moderately ill based on a brief in-person evaluation, often without even using a stethoscope.8, 9 The ECG is the single most important test for myocardial infarction (MI) and constitutes the gold standard for ST-elevation MI. If a patient is having an MI, he or she is unlikely to be included in a study of undifferentiated dyspnea. (Acute MI and unstable angina were exclusion criteria for many of the studies selected in this systematic review.) Even if the acute MI patient is included, the final classification for the cause of his or her dyspnea is more likely to be ischemia/infarction than AHF. In their Table 2, the authors report ST-elevation in only 5.2% of patients with AHF and 8.2% without AHF. As it is primarily a test for ischemia or arrhythmia, the ECG's suboptimal accuracy in diagnosing AHF is not surprising. 
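To make the arithmetic behind these pooled values concrete, the following minimal Python sketch converts a sensitivity and specificity into positive and negative likelihood ratios. The function name `likelihood_ratios` is ours, chosen only for illustration, and the inputs are the pooled values for dyspnea at rest and rales quoted above; it is a sketch of the standard definitions, not a method taken from Martindale et al.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous finding.

    LR+ = P(finding present | disease) / P(finding present | no disease)
    LR- = P(finding absent  | disease) / P(finding absent  | no disease)
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Dyspnea at rest: pooled sensitivity 54.6%, pooled specificity 49.6%.
dyspnea_at_rest = likelihood_ratios(0.546, 0.496)

# Rales: present in 62.3% of patients with AHF and in 31.9% of patients
# without AHF, so specificity = 1 - 0.319 = 0.681.
rales = likelihood_ratios(0.623, 0.681)

print("Dyspnea at rest: LR+ = %.2f, LR- = %.2f" % dyspnea_at_rest)
print("Rales:           LR+ = %.2f, LR- = %.2f" % rales)
```

Both likelihood ratios for dyspnea at rest come out close to 1 (about 1.08 and 0.92), which is the numerical form of the statement that the finding has little or nothing to do with the underlying cause. Rales does somewhat better (about 1.95 and 0.55) but still shifts the odds only modestly.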
Decreased ejection fraction (EF) on the ED echocardiogram had both sensitivity and specificity of about 80% (Table 5 in Martindale et al.), which means that this finding increases the odds of AHF by a factor of 4, and its absence reduces them to one-fourth. Since decreased EF is part of the criterion standard diagnosis of AHF, the 80% accuracy of the ED echocardiogram may be lower than expected. However, echocardiograms are technically difficult for ED physicians and may not add much if the patient has a recent echocardiogram in the medical record.
Many readers will be surprised by the modest discriminatory value reported for the chest x-ray and the high discriminatory value of lung ultrasound. The authors included eight studies of lung ultrasound covering almost 2,000 patients in their pooled analysis, although one study10 accounted for more than half of these patients. Lung ultrasound to diagnose AHF is relatively new and unfamiliar to many emergency physicians. These promising findings should prompt us to consider adding this test to our options for diagnosing AHF.
The accuracy reported for segmental bioimpedance analysis (sensitivity = 88.4%, specificity = 91.7%) is based on a single study of 292 patients11 that was susceptible to spectrum bias that would increase specificity: the study excluded a substantial number of non-AHF conditions, such as acute coronary syndrome, pericardial effusion, pulmonary embolism, advanced cirrhosis, chronic renal failure, and nephrotic syndrome.12 Also, the sensitivity and specificity were calculated using a cutoff derived from the study's own bioimpedance results, which constitutes a mild form of overfitting.13
In this case, a BNP level could be very helpful. If it were less than 100 pg/mL, heart failure would be extremely unlikely ([likelihood ratio = ] 0.09). If it were elevated, the probability of heart failure is higher but not diagnostic.
A BNP less than 100 pg/mL would allow the emergency physician to focus on treating the lung disease, but based on the pooled analysis of 9,143 patients (their Table 3), fewer than one-third of all dyspneic patients have a BNP that low, and a patient with COPD is even less likely to have a “normal” BNP, even in the absence of AHF. Whether a BNP higher than 100 pg/mL is diagnostic of heart failure depends on exactly how high it is. In the specific scenario described in the JAMA paper, a BNP of 1200 pg/mL would be “diagnostic” enough to prompt immediate treatment for AHF.
It is a mistake to treat a continuous test such as BNP as dichotomous, either positive or negative. Instead, we need to divide the continuous range of BNP results into multiple intervals and associate each BNP interval result with an interval likelihood ratio.16 Multiplying the pretest odds of disease by the likelihood ratio gives the posttest odds (Table 1). If the test result has a likelihood ratio greater than 1, then the test result makes the disease more likely (increases the odds); if it has a likelihood ratio less than 1, it makes the disease less likely (decreases the odds); and if it has a likelihood ratio close to 1, it does not change the likelihood of disease.

Table 1
1. Convert pretest probability of disease P to prior odds of disease:
   Prior Odds = P/(1 − P)
   Prior Odds = 0.33/(1 − 0.33) ≈ 0.5, or 1:2
2. Calculate the likelihood ratio associated with the test result:
   LR(result) = P(result|disease)/P(result|no disease)
   LR(1200 pg/mL) = LR(1000–1500 pg/mL) ≈ 7
3. Calculate posterior odds given the test results:
   Posterior Odds = Prior Odds × LR(result)
   Posterior Odds = 1:2 × 7 = 7:2 = 3.5
4. Convert posterior odds to posterior probability:
   Posterior Probability = Posterior Odds/(1 + Posterior Odds)
   Posterior Probability = 3.5/(1 + 3.5) = 7/9 ≈ 78%

As with other continuous markers such as the white blood cell (WBC) count, D-dimer, and serum lactate, emergency physicians have had to learn to be more sophisticated in their interpretation of BNP. Many of us formed the initial impression that a level < 150 pg/mL virtually excludes AHF; a level between 150 and 400 pg/mL provides little evidence against or in favor of AHF; a level of 400 to 1000 pg/mL makes AHF significantly more likely; and a level greater than 1000 pg/mL virtually rules it in. This initial impression was confirmed when a cogent review by Schwam17 in 2004 reported interval likelihood ratios for these ranges. Now, Martindale et al. have pooled patient-level BNP data from six studies (2,202 patients) and reported interval likelihood ratios in their Table 4 that again generally confirm our clinical interpretation of BNP. BNP as a test for AHF can be sensibly divided into four ranges: <150 pg/mL (“rule out,” LR < 0.2); 150 to 400 pg/mL (indeterminate, LR ≈ 1); 400 to 1000 pg/mL (suggestive, LR = 2–5); and >1000 pg/mL (“rule in,” LR > 7). NT-pro-BNP has proven more difficult to interpret and of less discriminatory value, at least when considered independently of patient age.
We find it difficult to understand why studies and systematic reviews of diagnostic test accuracy fail to report interval likelihood ratios for BNP. The authors of this review, and Schwam before them, have shown how much more useful this is.
None of us would be satisfied if the clinical laboratory reported BNP as either “≤100 pg/mL” or “>100 pg/mL,” yet this is what reporting sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio at a cutoff of 100 pg/mL assumes.18 Similarly, we would not accept a peripheral WBC count report of “>10,000/μL” in a patient whom we are evaluating for cholecystitis,19 a synovial fluid WBC count report of “<100,000 /μL” in a patient whom we are evaluating for septic arthritis,20 or an erythrocyte sedimentation rate (ESR) of “abnormal” in a patient whom we are evaluating for temporal arteritis.21 In 1993, Simel et al.22 wrote that “the final decision to use multilevel versus dichotomous likelihood ratios should be for clinical rather than statistical reasons.” We believe that the clinical use of BNP justifies reporting interval likelihood ratios. In fact, the accuracy of many continuous tests, such as WBC counts, ESR, D-dimer, and serum lactate, should be reported using interval likelihood ratios. The clearest message from this systematic review and meta-analysis is that continuous tests like BNP are best evaluated using pooled patient-level data to calculate interval likelihood ratios. This review may also prompt us to gain proficiency at doing bedside echocardiograms and lung ultrasounds. One way to help clinicians use these results is to incorporate them into a clinical decision tool (CDT) application that integrates the results of multiple diagnostic tests. The “jury is still out” on these CDTs, so in the meantime, we will be using our clinical experience to integrate the data on AHF in the acutely dyspneic patient and attempt to provide more timely and appropriate care for those who are “breathing not properly.”
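For readers who want to reproduce the odds calculation summarized in Table 1, the following minimal Python sketch applies a likelihood ratio to a pretest probability. The function name `posttest_probability` is ours and purely illustrative. The inputs are the worked values from the text (pretest probability 0.33, LR ≈ 7 for a BNP of 1200 pg/mL); pairing the same pretest probability with the quoted LR of 0.09 for a BNP below 100 pg/mL is our own illustrative assumption, not a calculation reported in the review.

```python
def posttest_probability(pretest_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pretest probability via odds."""
    if not 0 < pretest_probability < 1:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    # Step 1: probability -> prior odds
    prior_odds = pretest_probability / (1 - pretest_probability)
    # Step 3: posterior odds = prior odds x LR(result)
    posterior_odds = prior_odds * likelihood_ratio
    # Step 4: posterior odds -> posterior probability
    return posterior_odds / (1 + posterior_odds)


# Worked example from the text: BNP of 1200 pg/mL, interval LR of about 7.
rule_in = posttest_probability(0.33, 7)

# Illustrative assumption: same pretest probability, BNP below 100 pg/mL,
# using the likelihood ratio of 0.09 quoted in the text.
rule_out = posttest_probability(0.33, 0.09)

print("BNP 1200 pg/mL: posttest probability = %.0f%%" % (100 * rule_in))
print("BNP < 100 pg/mL: posttest probability = %.0f%%" % (100 * rule_out))
```

With a pretest probability of one-third, an interval likelihood ratio of 7 raises the probability of AHF to roughly 78%, while a likelihood ratio of 0.09 lowers it to roughly 4%. The same function can be reused for any interval likelihood ratio, which is the practical advantage of reporting results by interval rather than at a single cutoff.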
