Abstract

This paper proposes a novel methodology for the automatic detection and localization of gastrointestinal (GI) anomalies in endoscopic video frame sequences. Training is performed with weakly annotated images, using only image-level semantic labels instead of detailed, pixel-level annotations. This makes it a cost-effective approach for the analysis of large videoendoscopy repositories. Other advantages of the proposed methodology include its capability to suggest possible locations of GI anomalies within the video frames, and its generality, in the sense that abnormal frame detection is based on automatically derived image features. It is implemented in three phases: 1) it classifies the video frames as abnormal or normal using a weakly supervised convolutional neural network (WCNN) architecture; 2) it detects salient points from deeper WCNN layers using a deep saliency detection algorithm; and 3) it localizes GI anomalies using an iterative cluster unification (ICU) algorithm. ICU is based on a pointwise cross-feature-map (PCFM) descriptor extracted locally from the detected salient points using information derived from the WCNN. Results from extensive experimentation using publicly available collections of gastrointestinal endoscopy video frames are presented. The data sets used include a variety of GI anomalies. Both anomaly detection and localization performance, measured by the area under the receiver operating characteristic curve (AUC), exceeded 80%. The highest AUC for anomaly detection was obtained on conventional gastroscopy images, reaching 96%, and the highest AUC for anomaly localization was obtained on wireless capsule endoscopy images, reaching 88%.
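For readers less familiar with the reported metric, AUC can be read as the probability that a randomly chosen abnormal frame receives a higher anomaly score than a randomly chosen normal frame. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auc`, the label convention (1 for abnormal, 0 for normal), and the toy scores are assumptions made for illustration only.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0 (normal) or 1 (abnormal); hypothetical convention.
    scores: iterable of classifier scores, higher meaning "more abnormal".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one abnormal and one normal frame")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # abnormal frame correctly ranked above normal
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: six frames with made-up scores.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = auc(toy_labels, toy_scores)
```

In this toy case eight of the nine abnormal/normal pairs are ranked correctly, so the AUC is 8/9. The quadratic loop is chosen for readability; a rank-based formulation would be preferable for large frame collections.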
