Abstract

Objective: The Ki-67 labeling index (Ki-67LI) is a predictive and prognostic factor in breast cancer (BC). Its use in routine practice is limited by the lack of standardized, reproducible evaluation methods. In this study, a Ki-67 standard comparison card (SRC) and artificial intelligence (AI) software were used to evaluate Ki-67LI in breast cancer. We established training and validation sets and studied inter-observer repeatability.

Methods: A total of 300 invasive breast cancer specimens were randomly divided into a training set and a validation set of 150 cases each. Breast cancer Ki-67 standard comparison cards ranging from 5% to 90% were created. The training set was interpreted by nine pathologists of different ages using microscopic visual assessment (VA), SRC, microscopic manual counting (MC), and AI. The validation set was interpreted by three randomly selected pathologists using SRC and AI. The Friedman M test was used to analyze differences. The intraclass correlation coefficient (ICC) and Bland-Altman plots were used for consistency analysis.

Results:
1. Ki-67LI values interpreted by the four methods in the training set were not normally distributed (P<0.05). The Friedman M test showed that the difference between pathologists using the same method was statistically significant (P<0.05). After Bonferroni correction, for Ki-67LI interpreted using SRC and AI, the difference between each pathologist and the gold standard was statistically significant (P<0.05), whereas the difference between pathologists was not (P>0.05). For Ki-67LI interpreted using VA and MC, both the difference between each pathologist and the gold standard and the difference between pathologists were statistically significant (P<0.05).
2. In the training set, the ICC among the nine pathologists was significantly higher when Ki-67LI was interpreted using SRC (ICC=0.918) and AI (ICC=0.972) than when using VA (ICC=0.757) and MC (ICC=0.803).
3. With SRC, the ICC of the junior and intermediate-level pathologists in the training set increased.
4. In the homogeneous group of the training set, inter-observer agreement was very good for VA, MC, SRC, and AI, with all ICC values above 0.80. In the heterogeneous group, SRC and AI showed good inter-observer agreement (ICC=0.877 and 0.959, respectively). In both the homogeneous and heterogeneous groups of the validation set, agreement among the pathologists using SRC and AI was very good, with ICC >0.90.
5. In the validation set, the results obtained by the three pathologists using SRC and AI were highly consistent with the gold standard (ICC above 0.95), and inter-observer agreement was also very good (ICC >0.9).

Conclusion: AI showed satisfactory inter-observer repeatability, and its results were closer to the gold standard, making it the preferred method for reproducible Ki-67LI assessment. Where AI software is not yet widely available, SRC may serve as a candidate standard method for interpreting Ki-67LI in breast cancer.

Keywords: Breast cancer, Ki-67, Artificial intelligence, Ki-67 standard comparison card, Repeatability

Citation Format: Lina Li, Yueping Liu. Artificial intelligence-assisted interpretation of Ki-67 expression and repeatability in breast cancer [abstract]. In: Proceedings of the 2020 San Antonio Breast Cancer Virtual Symposium; 2020 Dec 8-11; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2021;81(4 Suppl):Abstract nr PS2-29.
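
For readers who want to reproduce this kind of consistency analysis, the following is a minimal sketch of an ICC and a Bland-Altman summary in plain Python. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1), is assumed here. The function names `icc_2_1` and `bland_altman_limits` and the example scores are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one row per case, one column per pathologist.
    Assumption: the abstract does not say which ICC form was used.
    """
    n = len(scores)        # number of cases
    k = len(scores[0])     # number of raters
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(col) for col in zip(*scores)]

    # Two-way ANOVA sums of squares (cases, raters, residual).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

def bland_altman_limits(a, b):
    """Mean difference (bias) and 95% limits of agreement between two methods."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)      # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical Ki-67LI readings (%), three pathologists on five cases.
example = [
    [10, 12, 11],
    [30, 28, 33],
    [55, 60, 58],
    [5, 5, 7],
    [80, 78, 85],
]
print(round(icc_2_1(example), 3))
print(bland_altman_limits([r[0] for r in example], [r[1] for r in example]))
```

An ICC near 1 indicates that most of the variance comes from differences between cases rather than between readers. The Bland-Altman limits show how far two sets of readings can be expected to differ on an individual case.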