Utility Analysis of Lip Features in Distinguishing Chinese Vowels and Lip Reading

Jianrong Wang, Yichao Zhang, Yu Chen, Wei Liu

doi:10.1088/1742-6596/1544/1/012179

Open Access

https://doi.org/10.1088/1742-6596/1544/1/012179

Abstract

The lip region provides the most direct visual information in multi-sensory speech perception and is used in speech recognition and lip reading. In this paper, we extract eight lip features from the articulation of the basic vowels [a], [e], [i], [u], [ü] in standard Chinese and, drawing on articulatory phonetics, analyze how effectively they distinguish the five vowels. We use a Dense Convolutional Network (DenseNet) to process two-dimensional lip images and fuse them with the lip features to identify Chinese speech containing consonants. The results show that the utility of lip shape features is consistent between Chinese vowel recognition and Chinese consonant lip reading. Fusing lip features with two-dimensional lip images effectively improves the recognition rate in lip reading.

Full Text