Abstract

The methods of measuring laryngeal elevation during swallowing are time-consuming. We aimed to propose a quick-to-use neural network (NN) model for measuring laryngeal elevation quantitatively using anatomical structures auto-segmented by Mask region-based convolutional NN (R-CNN) in videofluoroscopic swallowing study. Twelve videofluoroscopic swallowing study video clips were collected. One researcher drew the anatomical structure, including the thyroid cartilage and vocal fold complex (TVC) on respective video frames. The dataset was split into 11 videos (4686 frames) for model development and one video (532 frames) for derived model testing. The validity of the trained model was evaluated using the intersection over the union. The mean intersections over union of the C1 spinous process and TVC were 0.73 ± 0.07 [0–0.88] and 0.43 ± 0.19 [0–0.79], respectively. The recall rates for the auto-segmentation of the TVC and C1 spinous process by the Mask R-CNN were 86.8% and 99.8%, respectively. Actual displacement of the larynx was calculated using the midpoint of the auto-segmented TVC and C1 spinous process and diagonal lengths of the C3 and C4 vertebral bodies on magnetic resonance imaging, which measured 35.1 mm. Mask R-CNN segmented the TVC with high accuracy. The proposed method measures laryngeal elevation using the midpoint of the TVC and C1 spinous process, auto-segmented by Mask R-CNN. Mask R-CNN auto-segmented the TVC with considerably high accuracy. Therefore, we can expect that the proposed method will quantitatively and quickly determine laryngeal elevation in clinical settings.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call