Abstract

One way the Indonesian people can defend their country is by preserving and maintaining its regional languages. In the modern era, regional languages have come to be seen as old-fashioned, and most of them are no longer spoken. If this is ignored, the resulting crisis of cultural identity will leave regional languages vulnerable to extinction. Technology offers one way to preserve them: digital image-based artificial intelligence using machine learning methods, such as machine translation, can address the problem. This research uses a deep learning method, namely Convolutional Neural Networks (CNN). The data consist of 1300 alphabetic images, 5000 text images, and 200 vocabulary items of the Minangkabau regional language. The alphabetic image data are used to build the CNN classification model. This model is then used for text image recognition, and the recognized text is translated into the regional language. The CNN model achieves an accuracy of 98.97%, while text image recognition (OCR) achieves 50.72%. This low accuracy is due to segmentation failures on the letters i and j. However, translation accuracy rises to 75.78% after applying the Levenshtein distance algorithm, which can correct text classification errors. This research has therefore succeeded in applying the Convolutional Neural Networks (CNN) method to identify text in text images and the Levenshtein distance method to translate Indonesian text into regional-language text.

Highlights

  • One way the Indonesian people can defend the country is by preserving and maintaining regional languages

  • If ignored, there will be a cultural identity crisis that leaves regional languages vulnerable to extinction

  • Technological developments can be used as a way to preserve regional languages


Summary

Research Method

The research comprises several stages: a literature study, data collection, construction of the CNN OCR classification model, and evaluation of the classification model. An overview of the research stages can be seen in [...] used Levenshtein distance to automatically correct Bengali-to-English translation output, with an accuracy of 78.13%. Wint, Ducros, and Aritsugi [20] used Levenshtein distance to correct spelling in a social media dataset, with an accuracy of 90%. Arifudin and Alamsyah [21] also used Levenshtein distance for autocomplete and spell checking in library data searches.
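As a rough illustration of how Levenshtein distance can correct a misrecognized word, the sketch below computes the edit distance with the standard dynamic-programming recurrence and picks the closest entry from a word list. It is a minimal sketch, not the authors' implementation: the function names `levenshtein` and `closest_word`, the `max_distance` threshold, and the three-word Indonesian vocabulary are all assumptions made for the example.

```python
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b; insert, delete, substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def closest_word(word: str, vocabulary: Sequence[str],
                 max_distance: int = 2) -> Optional[str]:
    """Return the vocabulary entry nearest to word, or None if none is close.

    max_distance is a hypothetical cut-off chosen for this example.
    """
    if not vocabulary:
        return None
    best = min(vocabulary, key=lambda entry: levenshtein(word, entry))
    return best if levenshtein(word, best) <= max_distance else None


# Hypothetical vocabulary and a hypothetical OCR output with one wrong letter.
vocabulary = ["makan", "minum", "rumah"]
corrected = closest_word("makam", vocabulary)
```

In a pipeline like the one described, the corrected word would then be looked up in the Indonesian-to-Minangkabau vocabulary; that lookup is left out here because the vocabulary itself is not given in the text.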

Literature Study
CNN Classification Model Construction
Dataset
Findings
Experimental Results and Evaluation