Abstract
Aviation safety reports are essential sources for identifying and analyzing risks in civil aviation. These reports are written in plain language, so processing them automatically requires Natural Language Processing techniques. In Brazil, the vast majority of reports are written in Portuguese. Therefore, a first step toward comparing them with international databases of reports written in English is to translate the Brazilian reports. This work proposes a machine translation model based on the fine-tuning of pre-trained models. To this end, an aviation-specific corpus is developed to generate example data for model training. A pre-trained model is then fine-tuned on this corpus to obtain an automatic translation model that performs well on the task while accounting for the specifics of aviation language. As a result, a first model is implemented that produces coherent translations between Portuguese and English in the aviation domain.