Abstract

Background
Central reading with endoscopic scoring systems is recommended in clinical trials to assess disease severity in ulcerative colitis (UC). Human scoring of the Mayo Endoscopic Subscore (MES) from colonoscopy videos is time-consuming and can show high inter-reader variability. Automatic extraction of salient frames from endoscopic videos can support the development of computer-aided scoring, reduce discrepancies between local and central readers, and allow a faster review process [1, 2]. We present an informative frame selection framework that prioritizes the video segments most representative of disease activity, accelerating manual endoscopic scoring and enhancing video analysis systems.

Methods
An artificial intelligence (AI) system based on neural networks was trained on 104,572 endoscopic frames to automatically detect four image artifacts (blurriness, out-of-body views, camera movements and obstructed views). A new temporal smoothing module enables full-length video analysis by leveraging still-frame classification confidence levels. The system was evaluated on an independent dataset of retrospectively acquired videos from a large multicentric cohort (1,564 UC subjects, total video time: 357 hours). Videos were annotated for MES grading by six expert gastroenterologists using a 2 + 1 paradigm. Mean non-informative time and mean video informativeness level (mVIL) were evaluated to estimate the projected time savings from filtering non-representative frames. The system was integrated into a previously presented AI-powered reading platform, enabling video editing by removal of non-assessable intervals. Quadratic weighted kappa (QWK) was used to determine the effect of video trimming on MES scoring by comparing the AI platform's predictions with manual annotations.

Results
Accuracy in detecting non-informative frames was 80.02% on 12,575 independent samples. Non-informative filtering allowed a mean reduction in video time of 3 min 09 s, corresponding to an mVIL of 77.61% and a total time saving of 83 hours. When the system was used in conjunction with an automatic MES grading system, very high correlation was observed between gastroenterologists' scores and predicted scores, both over the whole dataset (QWK = 0.690) and for concordant central and local readings (QWK = 0.725). Video filtering improved the automatic MES classification accuracy by 1.1%.

Conclusion
The presented automatic AI system significantly reduces endoscopic video length by filtering non-informative frames and improves the performance of automated MES grading algorithms. When tested on a large cohort of UC subjects, the system demonstrated the potential to reduce manual scoring time and reader fatigue without compromising objective MES assessment.
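
The abstract does not specify how the temporal smoothing module or the mVIL are computed. As a rough illustration only, the sketch below assumes that smoothing is a centered moving average over per-frame "non-informative" confidences, that a frame is kept when its smoothed confidence falls below a threshold, and that mVIL is the mean over videos of the percentage of frames kept. All function names, the window size and the threshold are hypothetical and are not taken from the paper.

```python
from typing import List


def smooth_confidences(conf: List[float], window: int = 5) -> List[float]:
    """Centered moving average over per-frame 'non-informative' confidences.

    Assumption: the paper's smoothing module is not described in the
    abstract; a moving average is used here purely for illustration.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(conf)):
        lo = max(0, i - half)                  # window is truncated at the edges
        hi = min(len(conf), i + half + 1)
        smoothed.append(sum(conf[lo:hi]) / (hi - lo))
    return smoothed


def informative_mask(conf: List[float], window: int = 5,
                     threshold: float = 0.5) -> List[bool]:
    """True for frames whose smoothed non-informative confidence is below threshold."""
    return [c < threshold for c in smooth_confidences(conf, window)]


def video_informativeness_level(mask: List[bool]) -> float:
    """Percentage of frames kept as informative (assumed definition of VIL)."""
    if not mask:
        raise ValueError("mask must not be empty")
    return 100.0 * sum(mask) / len(mask)


def mean_vil(masks: List[List[bool]]) -> float:
    """Mean VIL across videos (assumed definition of mVIL)."""
    levels = [video_informativeness_level(m) for m in masks]
    return sum(levels) / len(levels)


# Toy per-frame confidences: one isolated spike, then a sustained artifact run.
conf = [0.1, 0.1, 0.9, 0.1, 0.1, 0.9, 0.9, 0.9]
mask = informative_mask(conf, window=3, threshold=0.5)
vil = video_informativeness_level(mask)
```

In this toy input the isolated high-confidence frame is absorbed by its neighbours and kept, while the sustained run at the end is filtered out, which is the kind of behaviour a temporal smoothing step is meant to provide over raw still-frame decisions.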
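
For readers unfamiliar with the agreement metric reported above, the following is a minimal sketch of the standard quadratic weighted kappa between two sets of ordinal scores. It assumes MES grades are integers from 0 to 3 (four classes); the function name and the example inputs are illustrative and do not reproduce the study's data or evaluation code.

```python
from typing import Sequence


def quadratic_weighted_kappa(a: Sequence[int], b: Sequence[int],
                             n_classes: int = 4) -> float:
    """Quadratic weighted kappa between two raters' ordinal scores.

    Scores are expected to be integers in [0, n_classes - 1].
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)

    # Observed confusion matrix between rater a (rows) and rater b (columns).
    observed = [[0] * n_classes for _ in range(n_classes)]
    for x, y in zip(a, b):
        observed[x][y] += 1

    # Marginal histograms for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_classes))
              for j in range(n_classes)]

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2 / (n_classes - 1) ** 2
            num += w * observed[i][j]
            den += w * hist_a[i] * hist_b[j] / n

    if den == 0.0:
        # Both raters assigned the same single class to every item.
        return 1.0
    return 1.0 - num / den


# Illustrative values only: expert grades versus predicted grades.
expert = [0, 1, 2, 3]
predicted = [0, 1, 2, 3]
kappa = quadratic_weighted_kappa(expert, predicted)
```

Because the weights grow with the squared distance between grades, a prediction that is off by two MES levels is penalised four times as heavily as one that is off by a single level.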