Summarization of Wireless Capsule Endoscopy Video Using Deep Feature Matching and Motion Analysis

B Sushma, P Aparna

doi:10.1109/access.2020.3044759

Journal: IEEE Access
Publication Date: Dec 15, 2020
Citations: 15
License type: CC BY 4.0

Affiliation: National Institute of Technology Karnataka

Abstract

Conventional wireless capsule endoscopy (WCE) video summarization techniques represent an image by extracting handcrafted features, which are not sufficient to capture the semantic similarity of endoscopic images. Supervised methods for extracting deep features from an image need an enormous amount of accurately labelled data for training. To address this, we use an unsupervised learning method that extracts features with a convolutional autoencoder. WCE images are then classified into similar and dissimilar pairs using a fixed threshold derived through a large number of experiments. Finally, a keyframe extraction method based on motion analysis is used to derive a structured summary of the WCE video. The proposed method achieves an average F-measure of 91.1% with a compression ratio of 83.12%. The results indicate that the proposed method is more efficient than existing WCE video summarization techniques.
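
The thresholding step and the two reported metrics can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the feature vectors are toy values standing in for autoencoder outputs, cosine similarity and the threshold of 0.9 are placeholders for the paper's similarity measure and experimentally derived threshold, the compression ratio is taken to be the fraction of frames removed, and the motion-analysis stage is not modelled.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_keyframes(features, threshold=0.9):
    """Keep a frame only when it is dissimilar to the last kept frame.

    `threshold` is a hypothetical value; the paper derives its own
    fixed threshold experimentally.
    """
    if not features:
        return []
    kept = [0]  # always keep the first frame
    for i in range(1, len(features)):
        if cosine_similarity(features[kept[-1]], features[i]) < threshold:
            kept.append(i)
    return kept


def compression_ratio(total_frames, kept_frames):
    """Fraction of frames removed (assumed definition)."""
    return 1.0 - kept_frames / total_frames


def f_measure(selected, ground_truth):
    """Harmonic mean of precision and recall over frame indices."""
    selected, ground_truth = set(selected), set(ground_truth)
    true_positives = len(selected & ground_truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(selected)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


# Toy feature vectors standing in for autoencoder outputs.
toy_features = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
keyframes = select_keyframes(toy_features, threshold=0.9)
print(keyframes)
print(compression_ratio(len(toy_features), len(keyframes)))
print(f_measure(keyframes, [0, 2, 3]))
```

Comparing each frame with the last kept frame, rather than with its immediate predecessor, prevents a slow drift across many near-identical frames from going unnoticed; whether the paper compares frames this way is not stated in this excerpt.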

Highlights

Wireless capsule endoscopy (WCE) is a non-invasive medical imaging procedure used to screen the entire gastrointestinal (GI) tract in order to detect various GI diseases [1].
The dataset consists of 3 WCE videos; a complete description of this dataset is available in [26] and [27].
Around 50000 WCE frames, captured at different locations of the GI tract from different patients and resized to a resolution of 256 × 256, are used for training the convolutional autoencoder neural network (CANN).

Summary

Introduction

Wireless capsule endoscopy (WCE) is a non-invasive medical imaging procedure used to screen the entire gastrointestinal (GI) tract in order to detect various GI diseases [1]. The capsule captures images at a rate of 3 to 6 frames per second for over 8 hours and acquires around 90000-180000 frames [2]. The capsule travels at a very slow speed of about 0.16-1 mm/s and captures 2-12 frames for every 1 mm of its travelling distance [3]. This slow movement results in a huge number of redundant frames with high structural similarity. A physician has to invest a lot of time, or appoint an assistant, to inspect this huge number of frames and summarize the endoscopy video by eliminating redundant frames. The major disadvantage of manual summarization is the chance of eliminating some of the frames
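
The frame counts quoted in the introduction follow from the frame rate and the recording duration. The short sketch below reproduces that arithmetic, assuming a recording of exactly 8 hours (the text says "over 8 hours", which is why the quoted range is slightly higher); the function name is hypothetical.

```python
def frames_captured(frames_per_second, hours):
    """Total frames recorded at a constant frame rate."""
    return int(frames_per_second * hours * 3600)


low = frames_captured(3, 8)   # lower frame rate from the text
high = frames_captured(6, 8)  # upper frame rate from the text
print(low, high)
```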

Methods

Results

Conclusion