Abstract

Background: Ki67 has been suggested as a marker for diagnosis of luminal A and B breast carcinomas. On the one hand, a multitude of studies have described significant results for Ki67 as a prognostic marker; on the other hand, the analytical validation and standardization of this marker have been a challenge. The best parameter for Ki67 interobserver performance is the intraclass correlation coefficient (ICC). ICC values between 0.59 and 0.92 have been reported. Recently, a minimum ICC of 0.8 has been suggested as a goal for the international ring trial and as a prerequisite for introduction of Ki67 into clinical practice. However, this suggested ICC is not derived from analysis of data, and the amount of pathologist variance that is allowed for meaningful biomarker results is still not defined.

Methods: This study is based on a total of 9069 tumor samples from three large clinical cohorts (IBCSG VIII+IX, BIG 1-98, and GeparTrio). In a systematic modeling approach, we introduced different amounts of variance to previously generated central pathology Ki67 datasets by simulating a total of 1800 different pathologist evaluations for each study cohort. These evaluations were assigned to groups with defined ICCs, ranging from very good concordance (ICC=0.9) to extremely poor concordance (ICC=0.1). For each of the simulated pathologist evaluations, all possible Ki67 cutoffs were systematically evaluated using the web-based software Cutoff Finder (http://molpath.charite.de/cutoff/). As endpoints, we used DFS for all three study cohorts as well as pCR for the neoadjuvant cohort.

Results: For the neoadjuvant GeparTrio study, the groups with ICCs of 0.8, 0.6 and 0.4 showed very similar performance, resulting in significant analyses for prediction of pCR across a wide range of cutoffs. The odds ratios for pCR were slightly lower with lower ICC. Even with an extremely low ICC of 0.2, 99% of the analyses had one or more significant cutpoints.
The survival endpoint DFS was shown to be very stable despite increased interpathologist variance in all three clinical cohorts. Even with a poor ICC of 0.4, the majority of cutpoints were significant for DFS. For IBCSG VIII+IX, 85% of the analyses with an ICC of 0.4 had one or more significant cutpoints for Ki67. In the large BIG 1-98 dataset (n=6090), even an ICC of 0.2 resulted in one or more significant DFS cutpoints in 100% of the analyses. Comparable results were obtained if the analysis was restricted to luminal tumors.

Conclusion: Our results suggest that Ki67 is extremely robust to pathologist variation. Even if less than 40% of the variance is attributable to true Ki67-based proliferation (ICC<0.4), this share of the information is sufficient to obtain statistically significant differences. This stable performance of Ki67 might explain the observation that many Ki67 studies achieve significant results despite interobserver variance and heterogeneity issues. It might also suggest a relevant clinical utility for Ki67 despite considerable variation introduced in the evaluation. Ongoing efforts to further reduce interobserver variability, however, should be continued.

Citation Format: Denkert C, Budczies J, Regan M, Loibl S, Dell'Orto P, von Minckwitz G, Mastropasqua M, Mehta K, Müller V, Kammler R, Pfitzner BM, Fasching PA, Viale G. Systematic analysis and modulation of Ki67 interobserver variance in 9069 patients from three clinical trials – How much pathologist concordance is needed for meaningful biomarker results? [abstract]. In: Proceedings of the Thirty-Eighth Annual CTRC-AACR San Antonio Breast Cancer Symposium: 2015 Dec 8-12; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2016;76(4 Suppl):Abstract nr P5-07-02.
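
Illustrative sketch (not the authors' code): the abstract does not state how variance was added to the central Ki67 values, so the example below assumes the simplest model, independent additive Gaussian reader noise, and reads the ICC as the share of total variance attributable to the true Ki67 value, as in the Conclusion. Under that assumption, a target ICC fixes the noise variance at var_true * (1 - ICC) / ICC. The function names, the synthetic "central" values, and the one-way ICC estimator are all hypothetical choices made for illustration.

```python
import numpy as np


def noise_sd_for_icc(true_values, target_icc):
    """SD of additive reader noise so that var_true / (var_true + var_noise) = target_icc."""
    var_true = np.var(true_values, ddof=1)
    return np.sqrt(var_true * (1.0 - target_icc) / target_icc)


def simulate_readers(true_values, target_icc, n_readers, rng):
    """Return an (n_samples, n_readers) matrix of simulated Ki67 readings.

    Assumption: each reader sees the central value plus independent Gaussian
    noise. Values are not clipped to 0-100, because clipping would change the
    variance and therefore the resulting ICC.
    """
    sd = noise_sd_for_icc(true_values, target_icc)
    noise = rng.normal(0.0, sd, size=(len(true_values), n_readers))
    return true_values[:, None] + noise


def icc_oneway(scores):
    """One-way random-effects ICC(1,1) for an (n_samples, n_readers) matrix."""
    n, k = scores.shape
    sample_means = scores.mean(axis=1)
    grand_mean = scores.mean()
    ms_between = k * np.sum((sample_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((scores - sample_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for central pathology Ki67 percentages (not trial data).
    central_ki67 = rng.gamma(shape=2.0, scale=10.0, size=5000)
    for target in (0.9, 0.8, 0.6, 0.4, 0.2):
        readings = simulate_readers(central_ki67, target, n_readers=4, rng=rng)
        print(f"target ICC {target:.1f} -> estimated ICC {icc_oneway(readings):.3f}")
```

In a workflow like the one described in the Methods, each column of such a matrix would then be dichotomized at every candidate cutoff and tested against the endpoint; that step is left out here because the abstract delegates it to Cutoff Finder.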
