It has become difficult to open an anesthesiology journal without seeing an article about anesthetic neurotoxicity. This issue of Anesthesia & Analgesia is no exception. Most of the work has been done in animals and suggests that harm can come to the developing brain when it is exposed to a broad array of commonly used anesthetic and sedative agents. The critical question, of course, is, does it happen in children? DiMaggio et al., in this issue of the Journal,1 approach this by using an administrative database of siblings to determine whether anesthesia and surgery in the first 3 years of life are associated with subsequent developmental or behavioral disorders. Their study shows that children exposed to anesthesia and surgery have a much greater likelihood of being diagnosed with a developmental or behavioral disorder than do unexposed children, but that within a matched twin pair, in which one sibling was exposed to anesthesia and surgery and the other was not, there was no greater risk of an adverse outcome. This clearly adds new information to the literature. But can this answer the question, “are anesthetics neurotoxic in children?” The manifestations of anesthetic neurotoxicity in children, if it exists, are nebulous. Unlike fetal alcohol syndrome, in which one sees a distinctive pattern of altered development,2 any adverse effect of anesthetic exposure on young children must be small, or it would be easy to demonstrate and would have been suspected long ago. We emphasize that by “small” we do not mean unimportant. The brain is an exquisitely fine-tuned organ, and even minimal damage to the development of neural circuits might have major functional ramifications. But the fact remains that although the animal studies on anesthetic neurotoxicity are compelling,3–5 the task of translating this finding to infants is huge because we do not know how and when the putative toxicity will manifest itself in humans. 
Then there is the formidable problem of distinguishing anesthetic effects from the impact of surgery or from the underlying pathophysiology that led the child to need surgery in the first place. Because anesthesia and surgery are inseparable, this will not be easy. This issue will not be resolved conclusively by any one study. Rather, many experiments based on different approaches will be needed. Officials of the U.S. Food and Drug Administration spoke to this recently when they wrote, “Generating definitive data about the effects of anesthetics on the developing brain will most likely take numerous animal and human studies spanning many years. Planning, conducting, and interpreting these studies will pose enormous challenges to the medical and scientific community. It seems unlikely that any single individual or organization will be able to muster the resources to take on this project.”6 So, what needs to be done? The “gold standard” would be a prospective, randomized controlled trial (RCT), perhaps of regional versus general anesthesia in infants. At least one such trial is enrolling patients (the GAS Study).7 However, as noted in an editorial in Anesthesiology,8 our incomplete knowledge makes designing an RCT very difficult. What age children should be enrolled (i.e., what is the vulnerable period in humans)? What kind of surgery should be included (i.e., do certain diseases/disorders predispose to anesthetic neurotoxicity via some unknown mechanism, or are children with certain disorders predisposed to problems even in the absence of anesthesia)? Does it matter what general anesthetic agents are used (volatile agents versus IV agents; γ-aminobutyric acid agonists versus N-methyl-D-aspartate antagonists)? Will the use of sedation in conjunction with regional anesthesia confound the study? Does exposure time/surgical duration matter? And perhaps more important, what is a meaningful outcome measure and when do we measure it? 
Do we seek developmental deficits in preschool; learning deficits in elementary school; social deficits in adolescence; or might it involve the loss of cognitive reserve and early dementia in old age? Does a short-term interim measure adequately predict long-term outcomes?9 Does an abnormality on a single neurocognitive test given at 3 years of age mean that the child will not get into the college of his or her choice? The information needed to answer these methodological questions can be gathered either from multiple RCTs (which, given the design difficulties, requires a kind of trial-and-error approach) or can be gained more rapidly and inexpensively via epidemiologic studies. Ideally, these would be based on well-designed prospective data-gathering efforts, but this approach shares many of the problems of an RCT (particularly the need for a very long postsurgical follow-up time). An alternative is to rely on analysis of existing data. This is the approach taken by DiMaggio et al.1 The authors mined a Medicaid database and found over 600,000 children born in the state of New York between 1999 and 2005. From this large group, they identified 5824 sibling pairs. After excluding children having minor surgical procedures (primarily circumcisions), children with preexisting disorders potentially associated with developmental difficulties, and a few that underwent high-risk surgeries, they were left with 304 children who had undergone surgery prior to age 3. These children were compared with the 10,146 children in whom no surgical procedures could be identified. Of the 304 exposed children, 24.7% carried a diagnosis of a developmental or behavioral disorder versus 8.8% of the unexposed children. Even after adjusting for other perinatal disorders that may have influenced outcome, the hazard ratio for a developmental or behavioral diagnosis associated with anesthesia and surgery was 1.7. 
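As a worked illustration of the unadjusted comparison above, the following minimal sketch computes the crude risk ratio and risk difference implied by the two reported proportions (24.7% and 8.8%). The function and variable names are hypothetical and chosen only for illustration. This is simple arithmetic on the percentages quoted in the text, not a reproduction of the authors' analysis, and it is a different quantity from the adjusted hazard ratio they report.

```python
def crude_risk_ratio(risk_exposed: float, risk_unexposed: float) -> float:
    """Ratio of two proportions, with no adjustment for confounders."""
    return risk_exposed / risk_unexposed


# Proportions quoted in the text (hypothetical variable names).
exposed_risk = 0.247    # diagnosed among the 304 exposed children
unexposed_risk = 0.088  # diagnosed among the 10,146 unexposed children

rr = crude_risk_ratio(exposed_risk, unexposed_risk)
risk_difference = exposed_risk - unexposed_risk

print(f"crude risk ratio: {rr:.2f}")              # 2.81
print(f"risk difference:  {risk_difference:.3f}")  # 0.159
```

The crude ratio is considerably larger than the adjusted hazard ratio, which is what one would expect if part of the raw difference reflects other characteristics of the exposed children rather than the exposure itself.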
However, a closer look yields different conclusions, which are the most important findings in the study. First, the difference disappeared when the analysis was restricted to children who had undergone only a single procedure (more about this later). More critically, when the analysis was confined to 138 matched twin pairs in which one child had undergone surgery while the other had not, there was no greater risk of the former being diagnosed later with a developmental or behavioral disorder. This is arguably the best-controlled comparison in the paper, because both siblings in the pair presumably share many of the same family, environmental, and socioeconomic factors. These seemingly paradoxical results within the same study point out both the advantages and limitations of the approach used by DiMaggio et al. A retrospective database design allows researchers to study large populations in which data have been gathered for long periods of time. It allows the investigator to study specific subpopulations with sufficient power to permit the detection of relatively small effects. The study populations are more diverse, and therefore represent a better cross-section of patients, society, and medical care than found in tightly controlled RCTs. Moreover, the results of such analyses typically have substantial validity,10 and well-designed observational studies tend to accurately estimate the magnitude of the effects of treatment in comparison with RCTs.11 On the other hand, there are challenges inherent in using existing databases. One is the reliability and validity of the data, which were typically not originally collected with a specific research question in mind. For instance, as acknowledged by DiMaggio et al., the New York Medicaid database has no information about the type of anesthesia (general versus regional), specific anesthetic medications used, or how much and how long they were given. 
Much of the information is entered by nonmedical personnel and rarely by individuals directly involved either in patient care or in the study in question. In addition, assignment of a diagnostic code number in the Medicaid Management Information Systems workflow is a several-stage process. As such, there is room for error. It is also not clear that coders use rigid definitions during data entry, and vague diagnoses such as “nonspecific delay” could overestimate the number of children with developmental problems. Medicaid databases may self-select populations that are already more vulnerable to developmental problems.12 The rates of developmental and behavioral disorders in the population studied by DiMaggio et al. are much higher than in the general population.13 In fact, the extraordinarily high incidence of problems in the surgical group suggests strongly that factors other than “neurotoxicity” are present. Another challenge inherent in studies using administrative databases is the definition of a comparison group. Ideally, the comparison group would be identical in all ways to the study group except that they never received an anesthetic (or any other neurotoxic agent at any time). DiMaggio et al. made an effort to create such a group, the matched sibling pairs. The fact that there was no difference in risk for diagnosis of an adverse outcome between the exposed and nonexposed matched sibling pairs challenges the results in the larger cohort and supports the position that surgery and anesthesia before age 3 are not associated with neurodevelopmental abnormalities. Others have used existing datasets to explore this same issue, with equally divergent results. 
Wilder et al.14 examined school and medical records of members of a birth cohort in Olmsted County, MN, who underwent any type of surgery or diagnostic procedure that required general anesthesia prior to their 4th birthday and determined the risk of learning disability after exposure (adjusting for postgestational age, sex, and birth weight). They reported a relationship between surgery and developmental difficulties, as judged by school test scores, but only in children who had undergone multiple surgeries. Another study that looked at children who underwent a urological procedure before 24 months of age did not find significantly more behavioral disturbances, as defined subjectively by the parents, than in children who had the procedure at later ages.15 A study by Bartels et al. using Dutch twins found no association between anesthetic exposure in children less than 3 years of age and subsequent educational achievement,16 leading them to conclude that “association between early exposure to anesthetic agents and the subsequent development of learning disabilities is potentially related to comorbidity rather than anesthetic exposure per se.”16 So what can we conclude? To date, the best-controlled studies (including the matched-pair comparisons of DiMaggio et al.) have failed to demonstrate that anesthesia has a negative neurodevelopmental impact on infants. What they and others have shown is that the more operations a child requires, the more likely they are to have problems. Is this evidence of causative anesthetic neurotoxicity? No, this is a clear example of “association does not prove causation.” It is a bit akin to the ice cream dealer at a busy lake who concludes that ice cream causes boating accidents because his sales increase in parallel with accidents (when, in reality, his sales simply parallel the number of boats on the lake). 
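The ice cream analogy can be made concrete with a small simulation. The sketch below is purely illustrative: every number in it (boat counts, sales per boat, accident rate, noise levels) is an invented assumption, not data from any study. It generates days on which the number of boats drives both ice cream sales and accidents, then shows that the strong crude correlation between sales and accidents largely vanishes once the shared cause (boats) is removed by linear adjustment.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """Residuals of ys after a least-squares linear fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


random.seed(0)  # fixed seed so the illustration is repeatable

# Invented assumptions: boats is the only true cause of both outcomes.
boats = [random.randint(10, 200) for _ in range(2000)]
sales = [2.0 * b + random.gauss(0, 20) for b in boats]      # ice creams sold
accidents = [0.05 * b + random.gauss(0, 1) for b in boats]  # accident index

crude = pearson(sales, accidents)
adjusted = pearson(residuals(sales, boats), residuals(accidents, boats))

print(f"crude correlation (sales vs accidents): {crude:.2f}")
print(f"after adjusting for boats:              {adjusted:.2f}")
```

In this toy setting the dealer's observation is accurate but his inference is not: sales and accidents move together only because both follow the number of boats.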
Children who require multiple surgeries are far more likely to have serious underlying medical disorders than are other children, disorders that can impact development. It is also plausible that the sociologic baggage and burdens, for both patient and family, which are attendant upon recurrent interactions with the medical care delivery process, are the actual culprits. The second reason why multiple procedures are problematic is a verification bias: children who see physicians frequently are more likely to receive additional diagnoses than are other children, some of which may be interpreted or coded as related to development or behavior.17 To reiterate, we believe that evidence is most consistent with the premise that “anesthesia per se,” given to an otherwise healthy child who needs only a “routine” surgical procedure, is not neurotoxic. This is extremely important because the preponderance of children who need anesthesia and surgery require only a single procedure: an inguinal hernia repair, orchiopexy, pyloromyotomy, etc. The corollary is that the data do not justify the sometimes alarmist rhetoric, in both the medical and lay press, surrounding this issue and that a “toning down” of the rhetoric is in order. The continued, repetitive discussion in the clinical literature of what is, to date, a laboratory phenomenon, needs to be limited, particularly when it includes suggestions about changing medical, surgical, or anesthetic practice. We are already encountering parents who fear for the well-being of their children and are asking us to avoid general anesthesia, despite a total lack of evidence that regional anesthesia is “safer.” We have no doubt (although without evidence) that there are parents who are declining surgery entirely because of such fears. This is unfortunate. Nothing to date justifies changes in medical, surgical, or anesthetic practice in infants. 
Nor are laboratory data, even those obtained in primates, sufficient cause for modifying how we treat children, in large measure because we have no assurance that the “cure” (one drug or technique over another) will not be worse than the “disease.” In an editorial over 30 years ago, Eger noted that “my father warned me that I could not disprove the existence of dragons.”18 Anesthetic neurotoxicity in children has become a modern-day dragon in medicine. We may be able, with rigorous, appropriately designed studies, to prove that neurotoxicity does exist, and we strongly encourage and support efforts in the laboratory, the computer (database), and the clinic to that end. The new era of electronic medical records (including electronic anesthesia records) may provide a better platform for conducting large epidemiological studies.19 However, it is also important that we exercise restraint and caution in making public announcements regarding our findings. We reiterate DiMaggio et al.'s advice to await results from more rigorous studies that provide more substantial evidence before we consider changing practice. But if, in the absence of conclusive findings from well-controlled studies, we manage to convince parents, clinicians, and regulators alike that “there may be a problem,” we will have serious difficulty later convincing those same people there is not. Dragons are very good at hiding from view, and they are very hard to kill.
DISCLOSURES
Name: Joss Thomas, MPH. Contribution: Manuscript preparation. Conflict of Interest: Dr. Thomas has no conflict of interest to report.
Name: Gregory Crosby, MD. Contribution: Manuscript preparation. Conflict of Interest: Dr. Crosby has no conflict of interest to report.
Name: John C. Drummond, MD, FRCPC. Contribution: Manuscript preparation. Conflict of Interest: Dr. Drummond has received lecture honoraria from Hospira.
Name: Michael Todd, MD. Contribution: Manuscript preparation. Conflict of Interest: Dr. Todd has no conflict of interest to report.