Abstract

The Zhuang ethnic minority in China possesses its own ethnic language and no ethnic script. Cultural exchange and transmission encounter hurdles as the Zhuang rely exclusively on oral communication. An online cloud-based platform was required to enhance linguistic communication. First, a database of 200 h of annotated Zhuang speech was created by collecting standard Zhuang speeches and improving database quality by removing transcription inconsistencies and text normalization. Second, SAformerNet, a more efficient and accurate transformer-based automatic speech recognition (ASR) network, is achieved by inserting additional downsampling modules. Subsequently, a Neural Machine Translation (NMT) model for translating Zhuang into other languages is constructed by fine-tuning the BART model and corpus filtering strategy. Finally, for the network’s responsiveness to real-world needs, edge-computing techniques are applied to relieve network bandwidth pressure. An edge-computing private cloud system based on FPGA acceleration is proposed to improve model operation efficiency. Experiments show that the most critical metric of the system, model accuracy, is above 93%, and inference time is reduced by 29%. The computational delay for multi-head self-attention (MHSA) and feed-forward network (FFN) modules has been reduced by 7.1 and 1.9 times, respectively, and terminal response time is accelerated by 20% on average. Generally, the scheme provides a prototype tool for small-scale Zhuang remote natural language tasks in mountainous areas.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call