Evidence-based medicine involves the deliberate integration of clinical research into therapeutic decision making.1 A prospective, randomized controlled trial (PRCT) is assumed to distribute unknown confounding variables equally between groups and to manipulate only the “treatment variable.” In medical PRCTs, this usually occurs in a well-defined population to determine efficacy; that is, does the intervention work on the participants in the study (internal validity)? In most medical (drug) studies, the greater challenge is determining whether the results can be extended to patients not in the study, so-called external validity. Surgical or interventional studies face the same generalizability or external validity issues; however, one of the greatest challenges in surgical trials is patient recruitment and the establishment of a valid study population to ensure internal validity. The recent history of medicine has been punctuated by PRCTs, which have established, reinforced, and challenged traditional clinical beliefs.2 One example of the astounding effect of level I trials is the recent literature comparing percutaneous vertebroplasty (PVP) with sham procedures3,4 and with conservative treatment5 for osteoporotic vertebral compression fractures. Many clinicians experience significant cognitive dissonance6 between the striking early clinical improvement of many patients and the popular media perception of the results of these trials. An excellent review in this issue of the journal effectively dissects 2 of these studies and highlights their strengths and weaknesses.5 However, the impact of these studies and others really distills down to the difficulties of establishing both internal and external validity. The major challenge of PRCTs is the requirement of equipoise by both patients and physicians regarding 2 apparently equally effective treatments.
Clinicians may be biased to recommend direct intervention to some patients and to enroll only patients with less severe symptoms. Although this is controlled for once the study begins, the bias can still appear through the inclusion-exclusion criteria developed by the clinicians. It is often the inclusion-exclusion criteria that are adjusted based on expected recruitment. Patient consent to participate in interventional studies is by far the greatest challenge, as systematic differences may exist between patients who are willing to participate in randomization and those who are not. There may be significant differences in risk taking, expectations, and perseverance between people who are willing to relegate their treatment to chance and patients who refuse to participate. This volunteer bias is akin to selection bias in observational studies. Most interventional studies enroll roughly 33% of eligible patients. Patient preference is a big part of evidence-based medicine and is perhaps the reason that advocates of the observational study tout it as the best methodology for ensuring external validity. This dilemma was addressed in the Spine Patient Outcomes Research Trial by including an observational cohort to capture patients who were not enrolled in the prospective, randomized study.7,8 PRCTs are expected to have appropriate and accurate long-term follow-up. However, in some situations long-term follow-up may be less relevant than early functional results. For example, in the orthopedic literature, the natural history of most femur fractures is healing by 6 to 12 months regardless of treatment.9,10 The goal of internal fixation is early mobilization and pain control to avoid the sequelae of prolonged immobilization, possibly at the expense of soft tissue stripping and healing.
Similarly, because the vertebral bodies have an excellent blood supply and soft tissue envelope, it is not surprising that the natural history of vertebral compression fractures is healing by 6 to 12 months. Is it fair to judge vertebroplasty against conservative treatment on long-term outcome? Would anyone forgo internal fixation of a femur fracture because of equivocal long-term fracture healing? The real question is whether the “internal fixation” facilitates substantial early pain improvement and mobilization. Indeed, early pain relief after PVP has been observed in all of the PVP trials.11 An important overlooked result in the study by Rousing et al was the significant (1.3 point, P < 0.02) improvement in Barthel functional score in the PVP group over the conservative group at 12 months. This functional improvement may reflect earlier pain control and mobilization than in conservatively treated patients. In practicing evidence-based medicine, we as clinicians and researchers must be wary of assigning truths to PRCTs, even if published in high-impact journals. The so-called pinnacle of the research design hierarchy is not immune to methodological limitations, which are often camouflaged by the unique and coveted PRCT design for evaluation of a surgical intervention. Common sense, continued investigation, and an appreciation of the principles of internal and external validity in high-level evidence studies will hopefully guide our practices in the future of spinal care.