Leveraging Free-Form Text in Maintenance Logs Through BERT Transfer Learning

Syed Meesam Raza Naqvi, Noureddine Zerhouni, Jean-Marc Nicod, Mohammad Ghufran, Christophe Varnier

doi:10.1007/978-3-030-98531-8_7

Abstract

Across industries, maintenance entries recorded over time capture decades of experience and the health history of different assets. Maintenance logs are usually free-form text entries written by maintenance operators, and they are highly unstructured and imbalanced. For these reasons, this large source of knowledge is usually underutilized and does not contribute to the development of tools that could improve maintenance processes. Recent advances in Natural Language Processing (NLP) and the increased focus on Industry 4.0 make it worthwhile to revisit this problem. This study explores the use of state-of-the-art NLP methods on free-form maintenance text. More specifically, it estimates the problem category of a maintenance log entry to see how well recent NLP models adapt to free-form maintenance text. The findings indicate that CamemBERT with a fine-tuning approach outperforms classical NLP approaches. The class imbalance problem is also addressed through data augmentation using deep contextualized embeddings, which yields a further performance improvement. Model accuracy and the Matthews correlation coefficient (MCC) are used as performance metrics to give a better understanding of results with imbalanced classes.

Keywords

Industry 4.0 (I4.0), NLP, BERT, Maintenance logs, Classification
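
The abstract's reason for reporting MCC alongside accuracy on imbalanced classes can be illustrated with a small sketch. It uses the standard multiclass form of MCC and is not taken from the paper's code; the class labels ("mechanical", "electrical") and the 9-to-1 split are hypothetical values chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def accuracy(y_true, y_pred):
    """Fraction of entries whose predicted category matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    Returns 0.0 when the denominator is zero (for example, when only
    one class is ever predicted), which is a common convention.
    """
    s = len(y_true)                                      # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correct predictions
    t_k = Counter(y_true)                                # true count per class
    p_k = Counter(y_pred)                                # predicted count per class
    labels = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in labels)
    spread_true = s * s - sum(v * v for v in t_k.values())
    spread_pred = s * s - sum(v * v for v in p_k.values())
    denominator = sqrt(spread_true * spread_pred)
    return numerator / denominator if denominator else 0.0


# Hypothetical imbalanced log categories: 9 majority entries, 1 minority entry.
y_true = ["mechanical"] * 9 + ["electrical"]
# A trivial classifier that always predicts the majority category.
y_majority = ["mechanical"] * 10

print("accuracy:", accuracy(y_true, y_majority))
print("MCC:", mcc(y_true, y_majority))
```

In this toy case the majority-class predictor scores high on accuracy while its MCC is zero, which is the kind of gap that motivates reporting both metrics.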

Full Text