Abstract

Study question
Does a standardized histopathological classification system lead to robust interobserver agreement in diagnosing chronic endometritis (CE), and can an accurate computational algorithm be developed?

Summary answer
The proposed scoring system shows good to excellent interobserver agreement among pathologists. A machine-learning algorithm outperformed pathologists in the clinically most relevant 2-tier system.

What is known already
Plasma cell (PC) presence in the endometrial stroma has been proposed as a marker for CE, yet no standardized classification system exists that considers the number or distribution of PCs. In addition, interobserver agreement has been insufficiently reported in previous publications on the topic. Without a robust histopathological score, the potential clinical impact of CE on ART treatment outcomes cannot be investigated. This problem is reflected in the heterogeneity of current diagnostic and therapeutic approaches, specifically for patients experiencing repeated implantation failure (RIF) or recurrent pregnancy loss (RPL).

Study design, size, duration
A 5-tier classification system based on CD138-positive PCs (Roche B-A38/automated Ventana system) was established: class 0 (no PCs), class IPC (isolated PCs), class 1 (≥1 cluster of 5-19 PCs/0.25 mm²), class 2 (≥1 cluster of 20-49 PCs/0.25 mm²) and class 3 (≥1 cluster of 50 or more PCs/0.25 mm²). A test set of 78 endometrial biopsies (collected over two years) was selected by a certified pathologist, who sorted >1000 biopsies to ensure that the full spectrum of PC counts and distributions was represented.

Participants/materials, setting, methods
Slides were digitized on the 3DHistech platform (pixel size 0.2738 × 0.2738 µm) and a scoring session was set up in the Pathomation software suite. Six pathologists scored the slides independently after training. Overall agreement was expressed as Fleiss' Kappa values.
Pairwise interobserver agreement was calculated using Cohen's Kappa statistics. A supervised machine-learning algorithm was developed in QuPath (v0.4.2) and treated as a seventh observer in the interobserver agreement calculations.

Main results and the role of chance
The CD138-based semiquantitative scoring system showed good to excellent interobserver agreement among all observers. Overall agreement values were 0.722 for the 5-tier system and 0.858 for the 2-tier system (classes 0 + IPC versus classes 1 + 2 + 3, considered the clinically most relevant classification, i.e. CE absent versus CE diagnosed). Pairwise interobserver agreement ranged from 0.871 to 0.951. Intra-class correlations ranged from 0.727 to 0.839 in the 5-tier system and from 0.893 to 0.965 in the 2-tier system. All findings were statistically significant (p < 0.001). The supervised machine-learning algorithm performed on par with the human pathologist observers and outperformed them in the 2-tier system. The proposed classification system provides a framework for assessing the presence and severity of endometrial PC infiltration and can guide further research.

Limitations, reasons for caution
Inconsistent staining intensity, non-specific epithelial staining and background interference are common problems that might limit interlaboratory use of the algorithm. Furthermore, the algorithm has not yet been associated with patient or clinical parameters. Accurate analysis of an endometrial biopsy might still be insufficient to provide a formal diagnosis of CE.

Wider implications of the findings
To unravel the association between PCs in endometrial biopsies (as a proxy for the diagnosis of CE) and reproductive outcomes following ART, a standardized methodology is of crucial importance for future studies. Higher accuracy, efficiency and automation are advantages of computer-aided diagnostics for CE.

Trial registration number
BUN 143202042810
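
Illustrative sketch
The scoring rules and the overall agreement statistic described above can be illustrated with a minimal Python sketch. It is not the study's QuPath algorithm or its statistical pipeline. The function names (`tier5`, `tier2`, `fleiss_kappa`) and the two inputs (`total_pcs`, `largest_cluster`) are hypothetical, and reducing each biopsy to the PC count of its densest 0.25 mm² field is a simplifying assumption.

```python
from collections import Counter


def tier5(total_pcs: int, largest_cluster: int) -> str:
    """Map plasma-cell (PC) counts to the 5-tier class.

    total_pcs: CD138-positive PCs found in the biopsy (hypothetical input).
    largest_cluster: PCs in the densest 0.25 mm^2 field (hypothetical input).
    """
    if total_pcs == 0:
        return "0"        # no PCs
    if largest_cluster >= 50:
        return "3"        # at least one cluster of 50 or more PCs
    if largest_cluster >= 20:
        return "2"        # at least one cluster of 20-49 PCs
    if largest_cluster >= 5:
        return "1"        # at least one cluster of 5-19 PCs
    return "IPC"          # isolated PCs only (assumed: no cluster of 5+)


def tier2(label: str) -> str:
    """Collapse the 5-tier class to the 2-tier system (0 + IPC vs 1 + 2 + 3)."""
    return "CE absent" if label in ("0", "IPC") else "CE diagnosed"


def fleiss_kappa(ratings: list[list[str]]) -> float:
    """Fleiss' kappa for a list of subjects, each rated by the same number of raters."""
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    totals = Counter()    # how often each category was assigned overall
    agreement_sum = 0.0   # sum of per-subject agreement P_i
    for row in ratings:
        counts = Counter(row)
        totals.update(counts)
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = agreement_sum / n_subjects
    p_e = sum((t / (n_subjects * n_raters)) ** 2 for t in totals.values())
    if p_e == 1.0:        # every rating falls in one category: kappa undefined
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)
```

For the 2-tier agreement, each observer's 5-tier labels would first be passed through `tier2` before calling `fleiss_kappa`; each row of `ratings` then holds one biopsy's labels from all observers.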