Abstract
Recent Editorials in this journal stressed the classical paradigm in clinical epidemiology of insisting on test–retest evaluations for studies on diagnosis and prognosis [1] and specifically prediction models [2]. Indeed, independent validation of previous research findings is an important scientific principle.

Another recent debate was on the interpretation of the lack of external validation studies of published novel prediction models [3–5]. One issue is the role that validation should have at the time of model development. Many researchers may be tempted to try to report some proof of external validity, that is, on discrimination and calibration, in independent samples with their publication that proposes a new prediction model. Major clinical journals currently seem to appreciate such reporting. Another issue is whether external validation should be performed by different authors than those involved in the development of the prediction model [3,6]. We would like to comment on these and related key issues in the scientific basis of prediction modeling.

The recent review confirms that model development studies are often relatively small for the complex challenges posed by specifying the form of a prediction model (which predictors to include) and the estimation of predictor effects (overfit with standard estimation methods) [3]. The median sample size was 445 subjects. The number of events is the limiting factor in this type of research and may be far too low for reliable modeling [4]. In such small samples, internal validation is essential, and apparent performance estimates are severely optimistic (Fig. 1). Bootstrapping is the preferred approach for internal validation of prediction models [7–9]. A bootstrap procedure should include all modeling steps for an honest assessment of model performance [10].
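For illustration, such an optimism-corrected bootstrap can be sketched as follows. The simulated data, the crude univariable screening rule, and the number of replications are assumptions made for demonstration only, not part of the cited procedures; the essential point is that every modeling step, including predictor selection, is repeated within each bootstrap sample:

```python
# Illustrative sketch of optimism-corrected bootstrap internal validation.
# The data, screening rule, and B are assumptions for demonstration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, p = 100, 15                          # small sample, many candidate predictors
X = rng.normal(size=(n, p))
logit = 0.8 * X[:, 0] - 0.6 * X[:, 1]   # only two predictors are truly informative
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

def fit_with_selection(X, y):
    """All modeling steps in one function: a crude univariable screen
    followed by logistic regression on the selected predictors."""
    keep = [j for j in range(X.shape[1])
            if abs(np.corrcoef(X[:, j], y)[0, 1]) > 0.1]
    keep = keep or [0]                  # keep at least one predictor
    return LogisticRegression(max_iter=1000).fit(X[:, keep], y), keep

# apparent performance on the development data
model, keep = fit_with_selection(X, y)
apparent = roc_auc_score(y, model.predict_proba(X[:, keep])[:, 1])

B, optimism = 200, []
for _ in range(B):
    idx = rng.integers(0, n, n)         # bootstrap sample, drawn with replacement
    mb, kb = fit_with_selection(X[idx], y[idx])
    auc_boot = roc_auc_score(y[idx], mb.predict_proba(X[idx][:, kb])[:, 1])
    auc_orig = roc_auc_score(y, mb.predict_proba(X[:, kb])[:, 1])
    optimism.append(auc_boot - auc_orig)    # optimism of this bootstrap model

corrected = apparent - np.mean(optimism)    # optimism-corrected c statistic
```

Reporting the corrected rather than the apparent estimate is what makes the internal validation honest; omitting the selection step from the bootstrap loop would understate the optimism.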
Specifically, any model selection steps, such as variable selection, need to be repeated per bootstrap sample if used.

We recently confirmed that a split sample approach with 50% held out leads to models with a suboptimal performance, that is, models with unstable and on average the same performance as obtained with half the sample size [11]. We hence strongly advise against random split sample approaches in small development samples. Split sample approaches can be used in very large samples, but again, we advise against this practice because overfitting is no issue if the sample size is so large that a split sample procedure can be performed. Split sample approaches only work when not needed.

More relevant are attempts to obtain impressions of external validity: do model predictions hold true in different settings, for example, in subjects from other centers, or subjects seen more recently? Here, a nonrandom split can often be made in the development sample, for example, by year of diagnosis. For example, we might validate a model on the most recent one-third of the sample held out from model development. Because the split is in time, this would qualify as a temporal external validation [6]. The disadvantages of a random split sample approach unfortunately equally hold here: a poorer model is developed (on a smaller sample size than the full development sample), and the validation findings are unstable (based on a small sample size) [9].

We make two propositions for validation at the time of prediction model development (Fig. 2). First, we recommend an "internal–external" validation procedure. In the context of individual patient data meta-analysis (IPD-MA), internal–external cross-validation has been used to show external validity of a prediction model [12,13]. In an MA context, the natural unit for splitting is by study. Every study is left out once, for validation of a model based on the remaining studies.
The final model is based on the pooled data set, which we label an "internally–externally validated" model.
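The leave-one-study-out procedure can be sketched as follows; the four simulated studies standing in for an IPD-MA and the logistic working model are illustrative assumptions:

```python
# Illustrative sketch of internal-external (leave-one-study-out) validation.
# The four simulated studies and the logistic model are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
studies = {}
for s in range(4):                      # four hypothetical studies in the IPD-MA
    X = rng.normal(size=(150, 3))
    # mild between-study shift in baseline risk
    logit = -0.2 + 0.9 * X[:, 0] - 0.5 * X[:, 1] + 0.2 * s
    studies[s] = (X, rng.binomial(1, 1 / (1 + np.exp(-logit))))

aucs = {}
for held_out in studies:                # each study is left out exactly once
    X_tr = np.vstack([studies[s][0] for s in studies if s != held_out])
    y_tr = np.concatenate([studies[s][1] for s in studies if s != held_out])
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    X_te, y_te = studies[held_out]
    aucs[held_out] = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

# the final, internally-externally validated model uses the pooled data
X_all = np.vstack([X for X, _ in studies.values()])
y_all = np.concatenate([y for _, y in studies.values()])
final_model = LogisticRegression(max_iter=1000).fit(X_all, y_all)
```

The per-study c statistics in `aucs` give a direct impression of how well the pooled model transports to each setting, while the final model still benefits from the full sample size.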