Abstract

Dysphagia has a large impact on individual patients and on society, yet its underlying mechanism has not been fully analyzed. To understand dysphagia, it is essential to describe the anatomical features of cervical structures during swallowing. This study aims to segment cervical intervertebral disks (IDs) in videofluorography (VF) by multi-channelization (MC) and a convolutional neural network (CNN). The frame images of VF are grayscale images. In the MC process, feature images are generated by applying image filters, such as the Sobel filter and the morphological top-hat transform, to the VF frame images. Three of the feature images are selected, and color images are generated by assigning the selected images to the RGB channels. The color images are then input into a CNN for segmentation. The proposed method is applied to actual VF, and experimental results are presented.
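
The following is a minimal sketch of the MC step described above, assuming OpenCV and NumPy. The kernel size and the particular selection of three feature images are illustrative choices, not the configuration used in the study.

```python
import cv2
import numpy as np

def multi_channelize(frame_gray: np.ndarray) -> np.ndarray:
    """Build a 3-channel image from a grayscale VF frame.

    Channel assignment (illustrative, not the authors' exact selection):
      R: original grayscale frame
      G: Sobel gradient magnitude (edge response)
      B: morphological top-hat (bright, thin structures such as disk spaces)
    """
    # Sobel gradient magnitude, normalized back to 8-bit range
    gx = cv2.Sobel(frame_gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(frame_gray, cv2.CV_32F, 0, 1, ksize=3)
    sobel = cv2.magnitude(gx, gy)
    sobel = cv2.normalize(sobel, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

    # Morphological top-hat (original minus opening); 15x15 kernel is an assumption
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15, 15))
    tophat = cv2.morphologyEx(frame_gray, cv2.MORPH_TOPHAT, kernel)

    # Assign the selected feature images to the RGB channels
    return np.dstack([frame_gray, sobel, tophat])  # shape: (H, W, 3), uint8

# Example usage: convert one VF frame before passing it to the segmentation CNN
# frame = cv2.imread("vf_frame_0001.png", cv2.IMREAD_GRAYSCALE)
# color_input = multi_channelize(frame)
```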

Highlights

  • Swallowing is a vital reflex in human life, but the entire relationship between bones and cartilages during swallowing has not yet been analyzed

  • There is no clear consensus on the shape analysis of intervertebral disks (IDs) for dysphagic patients


Introduction

Swallowing is a vital reflex in human life, but the entire relationship between bones and cartilages during swallowing has not yet been analyzed. There is no clear consensus on the shape analysis of IDs for dysphagic patients. Engineering studies have been conducted to segment the IDs of dysphagic patients (3,4). Wang et al. proposed joint learning for person re-identification (5). With this network, a single-image representation (SIR) and a cross-image representation (CIR) are obtained in order to label trained people (probes) and unknown people (galleries). Ngiam et al. proposed multimodal deep learning, into which several modalities, such as audio and video, are input (6). These inputs are set separately in restricted Boltzmann machines (RBMs) and are trained as a bimodal deep belief network (DBN). By using a deep autoencoder model, both audio and video are reconstructed from the video input alone.
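
A minimal PyTorch sketch of the bimodal deep autoencoder idea cited above is shown below: one modality (video) is encoded into a shared representation, and both modalities (audio and video) are decoded from it. The class name, layer sizes, and feature dimensions are hypothetical, and this is not Ngiam et al.'s exact architecture or training procedure (which pretrains RBMs before fine-tuning).

```python
import torch
import torch.nn as nn

class BimodalAutoencoder(nn.Module):
    """Illustrative autoencoder reconstructing two modalities from video input."""

    def __init__(self, video_dim=1024, audio_dim=256, shared_dim=128):
        super().__init__()
        # Video-only encoder producing a shared multimodal representation
        self.encoder = nn.Sequential(
            nn.Linear(video_dim, 512), nn.ReLU(),
            nn.Linear(512, shared_dim), nn.ReLU(),
        )
        # Separate decoders reconstruct each modality from the shared code
        self.video_decoder = nn.Sequential(
            nn.Linear(shared_dim, 512), nn.ReLU(),
            nn.Linear(512, video_dim),
        )
        self.audio_decoder = nn.Sequential(
            nn.Linear(shared_dim, 512), nn.ReLU(),
            nn.Linear(512, audio_dim),
        )

    def forward(self, video):
        z = self.encoder(video)
        return self.video_decoder(z), self.audio_decoder(z)

# Training would minimize reconstruction error on both modalities given video only:
# model = BimodalAutoencoder()
# video_rec, audio_rec = model(video_batch)
# loss = (nn.functional.mse_loss(video_rec, video_batch)
#         + nn.functional.mse_loss(audio_rec, audio_batch))
```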
