Abstract

Manufacturing text often exists as unlabeled data, and its entities are fine-grained and difficult to extract. As a result, knowledge in the manufacturing industry is underutilized. This paper proposes a novel Chinese fine-grained named entity recognition (NER) method that combines symmetry lightweight deep multinetwork collaboration (ALBERT-AttBiLSTM-CRF) with model transfer considering active learning (MTAL), targeting fine-grained NER on Chinese textual data with few labels. The method has two stages. In the first stage, ALBERT-AttBiLSTM-CRF was applied to and verified on the public CLUENER2020 dataset to obtain a pretrained model; the experiments show that the model reaches an F1 score of 0.8962, an improvement of 9.2% over the best baseline algorithm. In the second stage, the pretrained model was transferred to the Manufacturing-NER dataset (our dataset), and an active learning strategy was used to further optimize the model. The final F1 score on Manufacturing-NER was 0.8931 after model transfer, compared with 0.8576 before it, an improvement of 3.55 percentage points. The method effectively transfers existing knowledge from public source data to scientific target data, addressing named entity recognition when labeled domain data are scarce.
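
The active learning step in the second stage can be illustrated with a small sketch. The abstract does not state which query strategy MTAL uses, so least-confidence sampling is assumed here purely for illustration; the function names, the sentence ids, and the probability values are all hypothetical and do not come from the paper. The idea is that the transferred model scores each unlabeled sentence, and the sentences it is least sure about are sent for manual labeling first.

```python
from typing import Dict, List, Sequence


def least_confidence(token_probs: Sequence[Sequence[float]]) -> float:
    """Sentence uncertainty: 1 minus the mean of each token's top label probability."""
    if not token_probs:
        return 0.0
    top = [max(p) for p in token_probs]
    return 1.0 - sum(top) / len(top)


def select_for_labeling(
    pool: Dict[str, Sequence[Sequence[float]]], budget: int
) -> List[str]:
    """Return the ids of the `budget` most uncertain sentences in the unlabeled pool."""
    ranked = sorted(pool, key=lambda sid: least_confidence(pool[sid]), reverse=True)
    return ranked[:budget]


# Hypothetical per-token label distributions produced by the transferred model
# (assumption: three labels per token, two tokens per sentence).
pool = {
    "s1": [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]],   # model is confident
    "s2": [[0.4, 0.3, 0.3], [0.5, 0.25, 0.25]],   # model is uncertain
    "s3": [[0.6, 0.2, 0.2], [0.7, 0.2, 0.1]],     # in between
}

# Sentences chosen for annotation in this round.
picked = select_for_labeling(pool, budget=2)
```

In a full loop, the selected sentences would be labeled, added to the training set, and the model fine-tuned again before the next selection round; other uncertainty measures (for example, sequence-level entropy from the CRF layer) could be substituted for the scoring function.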

Highlights

  • Manufacturing relies on production experience, so mining and reusing industry knowledge is essential

  • The ALBERT-AttBiLSTM-CRF (conditional random field) model is shown in Figure 5 and includes an ALBERT layer and a bidirectional long short-term memory (BiLSTM) layer. The hidden layers of the two long short-term memory (LSTM) directions are calculated separately, and the output sequences of the two directions are …

  • This feature matrix is used as the input to the ALBERT-AttBiLSTM-CRF method


Summary

Introduction

Manufacturing relies on production experience, so mining and reusing industry knowledge is essential. As manufacturing and technology continue to evolve, more and more researchers are using advanced technologies to mine manufacturing data in support of production [1]. Technologies such as deep learning and artificial intelligence have been applied to traditional manufacturing industries. Manufacturing knowledge often exists as unstructured textual data, such as manufacturing standards. Such data contain a large amount of information about manufacturing, and how to mine prior empirical knowledge from them has become an important problem for industry researchers.

Symmetry 2020, 12, 1986; doi:10.3390/sym12121986; www.mdpi.com/journal/symmetry

