Although randomized controlled studies are considered the “gold standard” for assessing efficacy and safety, this approach is not always practical for ascertaining long-term safety. To better define emerging safety signals, particularly for products already on the market, we mostly rely on observational studies. However, we are cognizant that all observational studies suffer from a critical deficiency: the design is not an experimental one. Because each patient's treatment is deliberately chosen rather than randomly assigned, the risk of selection bias is unavoidable. As a result, systematic differences in outcomes may not be due to the treatment itself. Although methods to adjust for identifiable differences are available, it is impossible to be certain that such adjustments are sufficient or that they address all of the patients' relevant characteristics. Estimates of the magnitude of the treatment effect and of the generalizability of the findings are therefore imprecise. All these caveats apply to any observational study. Observational studies do not definitively answer a question, but rather generate hypotheses that need to be further explored with additional studies. Questions regarding the short- and long-term safety of GH have been gaining attention due to the expansion of GH indications and the use of increased dosages. In this issue of the JCEM, Carel et al. (1) report the results of a longitudinal follow-up study of subjects receiving GH in which they ascertained its long-term effects on mortality. By 2009, after a mean follow-up of 17.3 yr, they were able to determine the vital status of about 95% of approximately 7000 subjects who received GH in France. Indications for GH varied, but 75% were labeled as idiopathic GH deficient (IGHD), 11.5% as idiopathic short stature, 8% as having a GH neurosecretory disorder, and 5.5% as small for gestational age. Ninety-three subjects died.
Eighty-six deaths occurred after GH discontinuation, and six occurred during active treatment. All-cause mortality was 33% higher than expected when compared with the general population. This increase was attributed to a higher than expected number of bone tumors, as well as subarachnoid and intracerebral hemorrhages. The increases were not associated with any other types of cancer. Doses in excess of 50 μg/kg·d were strongly associated with the increase in mortality. Significance was seen only after 15 yr of follow-up. Analyses at 5 and 10 yr did not show a significant increase in mortality compared with the general population. Males and children who were shorter at baseline were at increased risk, as were those with more robust responses to provocative tests. The strengths of this study (1) include the balanced review and fair presentation of the information; the extent of the data, including diagnoses, dosages, time of exposure, and long-term follow-up; the in-depth attempts to ascertain causes of death; and the use of internal and external controls. A major weakness of this study (1) is the lack of efficacy data. We do not know how these subjects responded to the treatment. Other limitations are related to the inability to identify and secure an appropriate short stature comparator, the lack of data on growth velocity before GH was administered, the absence of IGF-I levels before and during GH, the small number of events, and the substantial amount of missing data, particularly in the ascertainment of cause of death in 22% of cases. The mean age at the time of censoring was less than 30 yr, and it is unknown whether the observed trend will continue or disappear with the passage of time. The number of females may be too small to detect any effect in this group. Many of these shortcomings were recognized by the authors, and attempts to address them were made. Each of these deficiencies, however, opens the door to alternative explanations.
It is important to stress that the difficulties facing these authors are common in the field, and that investigators undertaking this sort of project are likely to encounter similar barriers.