Endoscopic assessment of mucosal disease activity is routinely used to determine eligibility and response to therapy in clinical trials of ulcerative colitis. The operating properties of the existing endoscopic scoring indices are unclear. A systematic review was undertaken to evaluate the development and operating characteristics of endoscopic scoring indices for the evaluation of ulcerative colitis. We searched MEDLINE, Embase and CENTRAL from inception to 5 July 2016. We also searched references and conference proceedings (Digestive Disease Week, United European Gastroenterology Week, European Crohn's and Colitis Organization). Any study design (e.g. randomized controlled trials, cohort studies, case series) that evaluated endoscopic indices for evaluation of ulcerative colitis disease activity was considered for inclusion. Eligible participants were adult patients (> 16 years) diagnosed with ulcerative colitis using conventional clinical, radiologic and endoscopic criteria. Two authors independently reviewed the studies identified from the literature search. These authors also independently extracted and recorded data on the number of patients enrolled; number of patients per treatment arm; patient characteristics including age and gender distribution; endoscopic index; and outcomes such as reliability (intra-rater and inter-rater), validity (content, construct, criterion), responsiveness and feasibility. Any disagreements regarding study inclusion or data extraction were resolved by discussion and consensus with a third author. Risk of bias was assessed by determining whether assessors were blinded to clinical information and whether assessors scored the endoscopic index independently. We also assessed the methodological quality of the validation studies using the COSMIN checklist. MAIN RESULTS: A total of 23 reports of 20 studies met the pre-defined inclusion criteria and were included in the review.
The 20 included validation studies assessed 19 endoscopic scoring indices, including the Azzolini Classification, Baron Score, Blackstone Endoscopic Interpretation, Chinese Grading System of Ulcerative Colitis, Endoscopic Activity Index, Jeroen Score, Magnifying Colonoscopy Grade, Matts Score, Mayo Clinic Endoscopic Subscore, Modified Baron Score, Modified Mayo Clinic Endoscopic Subscore, Osada Score, Rachmilewitz Endoscopic Score, St. Mark's Index, Ulcerative Colitis Colonoscopic Index of Severity (UCCIS), endoscopic component of the Ulcerative Colitis Disease Activity Index (UCDAI), Ulcerative Colitis Endoscopic Index of Severity (UCEIS), Witts Sigmoidoscopic Score and Watson Grade. The individuals who performed the endoscopic scoring were blinded to clinical and/or histologic information in ten of the included studies and not blinded in one; it was unclear whether blinding occurred in the remaining nine. Independent observation was confirmed in four of the included studies, unclear in five, and not applicable (since inter-rater reliability was not assessed) in the remaining eleven. The methodological quality (COSMIN checklist) of most of the included studies was rated as 'good' or 'excellent'. One study that assessed responsiveness was rated as 'fair'. The inter-rater reliability of nine endoscopic scoring indices including the Baron Score, Blackstone Endoscopic Interpretation, Endoscopic Activity Index, Matts Score, Mayo Clinic Endoscopic Subscore, Osada Score, UCCIS, UCEIS and Watson Grade was assessed in seven studies, with estimates of correlation, κ, ranging from 0.44 to 0.97.
The intra-rater reliability of seven endoscopic scoring indices including the Baron Score, Blackstone Endoscopic Interpretation, Matts Score, Mayo Clinic Endoscopic Subscore, Osada Score, UCCIS and UCEIS was assessed in three studies, with estimates of correlation, κ, ranging from 0.41 to 0.86. No studies assessed content validity. Three studies evaluated the criterion validity of three endoscopic scoring indices including the Rachmilewitz Endoscopic Score, Magnifying Colonoscopy Grade and the UCCIS. These indices were correlated with objective markers of disease activity including albumin, blood leukocytes, C-reactive protein, fecal calprotectin, hemoglobin, mucosal interleukin-8 concentration and platelet count. Correlation estimates ranged from r = -0.19 to 0.83. Thirteen endoscopic scoring indices were tested for construct validity in 13 studies. Estimates of correlation between the endoscopic scoring indices and other measures of disease activity ranged from r = 0.27 to 0.93. Two studies explored the responsiveness of four endoscopic scoring indices including the Mayo Endoscopic Subscore, Modified Baron Score, Modified Mayo Endoscopic Subscore and UCEIS. One study concluded that the Modified Baron Score, Modified Mayo Endoscopic Subscore and UCEIS had similar responsiveness for detecting disease change in ulcerative colitis. The other included study concluded that the UCEIS may be the most accurate endoscopic scoring tool. None of the included studies formally assessed feasibility. While the UCEIS, UCCIS and Mayo Clinic Endoscopic Subscore have undergone extensive validation, none of these instruments has been fully validated and only two studies assessed responsiveness. Further research on the operating properties of these indices is needed given the lack of a fully validated endoscopic scoring instrument for the evaluation of disease activity in ulcerative colitis.