The Research of Lip Reading Based on STCNN and ConvLSTM

Yijie Zhu

doi:10.1088/1742-6596/1651/1/012076

Open Access

https://doi.org/10.1088/1742-6596/1651/1/012076

Journal: Journal of Physics: Conference Series
Publication Date: Nov 1, 2020
License type: cc-by

Affiliation: Beijing Jiaotong University

#Spatiotemporal Convolutional Neural Networks #Convolutional Long Short-Term Memory

Abstract

To address the shortcomings of temporal modeling in lip reading, this paper proposes a deep learning model based on spatiotemporal convolutional neural networks (STCNN) and Convolutional Long Short-Term Memory (ConvLSTM). First, STCNN learns features from the extracted lip images. The learned features are then passed to ConvLSTM, which processes the time-series data; its output is classified by softmax, and the connectionist temporal classification (CTC) loss function is used to optimize the result. Trained on the GRID dataset, the model reaches a word-level recognition accuracy of 95.0% in comparative experiments. The experiments show that the model can improve the accuracy of lip reading.
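
The abstract names CTC as the loss applied to the per-frame softmax outputs. As a minimal sketch of what that loss computes, the standard CTC forward recursion can be written in plain Python. This is a generic illustration, not the paper's implementation: the function name `ctc_neg_log_likelihood`, the blank index 0, and the toy probabilities in the usage note are all assumptions, and the STCNN and ConvLSTM layers that would produce the per-frame distributions are not modeled here.

```python
import math


def ctc_neg_log_likelihood(probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC (hypothetical helper).

    probs  : list of T per-frame softmax distributions (lists of floats)
    target : list of label ids, without blanks
    blank  : index of the CTC blank symbol (assumed to be 0 here)
    """
    # Extended target with blanks interleaved: [b, l1, b, l2, ..., b]
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # alpha[s] = total probability of all alignment prefixes ending at ext[s]
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]                 # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]        # advance by one position
            # Skip the blank between two labels only if the labels differ
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # Valid alignments end on the last label or on the trailing blank
    likelihood = alpha[-1] + (alpha[-2] if S > 1 else 0.0)
    return -math.log(likelihood) if likelihood > 0 else math.inf
```

For example, with two frames, a two-symbol vocabulary (blank and one label), and uniform per-frame probabilities, the target `[1]` has three valid alignments (`1 1`, `blank 1`, `1 blank`), so `ctc_neg_log_likelihood([[0.5, 0.5], [0.5, 0.5]], [1])` equals `-log(0.75)`. Summing over alignments in this way is what lets the loss be trained without frame-level labels.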

Full Text