Automatic Translation between Mixtec to Spanish Languages Using Neural Networks

Hermilo Santiago-Benito, Noé-Alejandro Castro-Sánchez, Teresa García-Ramirez, Julio-Alejandro Romero-González, Juan Terven, Diana-Margarita Córdova-Esparza

doi:10.3390/app14072958

Abstract

This paper introduces a novel method for collecting and translating texts from Mixtec to Spanish. The method comprises four primary steps. First, we collected a Mixtec–Spanish corpus of 4568 sentences from educational and religious texts. To enlarge the parallel corpus, we generated synthetic data with GPT-3.5. Second, we cleaned the data with a semi-automatic approach, followed by preprocessing and tokenization. In preprocessing, we removed stop words, duplicated sentences, special characters, and numbers, and converted the text to lowercase. Third, we performed semi-automatic alignment to match each Mixtec sentence with its Spanish counterpart, producing the sentence-level aligned texts needed for translation. Finally, we trained automatic translation models based on recurrent neural networks, bidirectional recurrent neural networks, and Transformers. Our system achieved a BLEU score of 95.66 for Mixtec-to-Spanish translation and 99.87 for Spanish-to-Mixtec translation. We also obtained a translation edit rate (TER) of 0.5 for Spanish-to-Mixtec and 16.5 for Mixtec-to-Spanish. Our research is a pioneering effort in automatic Mixtec-to-Spanish translation in Mexico and fills a gap identified in the current literature.
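The preprocessing step described in the abstract (lowercasing, then removing special characters, numbers, stop words, and duplicated sentences) could look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: the function names `clean_sentence` and `preprocess_pairs` are hypothetical, the stop-word sets are supplied by the caller because the abstract does not give them, and keeping the apostrophe and combining marks is an assumption made so that orthographic marks are not stripped along with punctuation.

```python
import unicodedata


def clean_sentence(text, stop_words=frozenset()):
    """Lowercase a sentence and drop digits, special characters, and stop words.

    Assumption: letters, combining marks, whitespace, and the apostrophe are
    kept; everything else is treated as a special character.
    """
    text = unicodedata.normalize("NFC", text.lower())
    kept = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if ch.isalpha() or ch.isspace() or ch == "'" or is_mark:
            kept.append(ch)
    tokens = [tok for tok in "".join(kept).split() if tok not in stop_words]
    return " ".join(tokens)


def preprocess_pairs(pairs, src_stop=frozenset(), tgt_stop=frozenset()):
    """Clean (source, target) sentence pairs and drop duplicates and empties."""
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        pair = (clean_sentence(src, src_stop), clean_sentence(tgt, tgt_stop))
        # Skip pairs where either side is empty after cleaning,
        # and pairs already seen (duplicated sentences).
        if all(pair) and pair not in seen:
            seen.add(pair)
            cleaned.append(pair)
    return cleaned
```

For example, `clean_sentence("El niño come 2 tortillas.", {"el"})` yields `"niño come tortillas"`. Deduplication here runs after cleaning, so two sentences that differ only in case or punctuation count as duplicates; whether the paper deduplicates before or after cleaning is not stated.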

Journal: Applied Sciences
Publication Date: Mar 31, 2024
License type: CC BY 4.0

