Abstract

Data labeling is the task of annotating training data for artificial intelligence (AI) with the correct answers. It improves the performance of AI learning models, but it is often done manually. Automating data labeling so that high-quality training data can be produced efficiently can serve as a foundation for the development of AI.

In this paper, we propose a method that uses topic modeling to automate the labeling of error data produced by a smart factory control system. Error text files were created by extracting the major error-related items and error messages from a database accumulated in the smart factory operating environment. Before topic modeling, frequently appearing words were extracted through a basic analysis of the error text, and the main causes of errors were roughly identified by visualizing these words as bar graphs and word clouds. Major error-related topics were then extracted by applying topic modeling to the error text. Based on the keywords in each topic, each topic was given a meaning, error types were derived, and an error type code was assigned to each type. Coherence and perplexity were calculated to determine the optimal number of topics, and four to five topics were found to be optimal. This paper is meaningful in that it confirms the possibility of automating data labeling for big data that includes text data.
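The pipeline described above (tokenize error messages, fit a topic model, compare candidate topic counts, assign an error type code per message) can be illustrated with a minimal sketch. The abstract does not name the topic modeling algorithm or the tooling, so everything here is an assumption: the sketch uses latent Dirichlet allocation fitted by collapsed Gibbs sampling, written with the Python standard library only. The sample messages, the function names, the hyperparameters and the `E01`-style codes are hypothetical. Only training-set perplexity is computed; coherence is omitted for brevity.

```python
import math
import random
from collections import Counter

# Hypothetical error messages; the paper's real data is not reproduced here.
MESSAGES = [
    "conveyor motor overload detected",
    "motor overload on conveyor line",
    "sensor timeout on plc connection",
    "plc connection timeout sensor offline",
    "database write failed disk full",
    "disk full database backup failed",
]


def tokenize(message):
    """Lowercase a message and keep purely alphabetic tokens."""
    return [tok for tok in message.lower().split() if tok.isalpha()]


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling and return the count tables."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]   # tokens per (topic, word)
    topic_total = [0] * n_topics                        # tokens per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, z in zip(doc, assign[d]):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return doc_topic, topic_word, topic_total, vocab_size


def perplexity(docs, model, alpha=0.1, beta=0.01):
    """Training-set perplexity: exp of the negative mean token log-likelihood."""
    doc_topic, topic_word, topic_total, vocab_size = model
    n_topics = len(topic_total)
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        denom = len(doc) + n_topics * alpha
        for w in doc:
            p = sum(
                (doc_topic[d][k] + alpha) / denom
                * (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
                for k in range(n_topics)
            )
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)


def label_documents(model, prefix="E"):
    """Assign each document a hypothetical code from its dominant topic."""
    doc_topic = model[0]
    return [f"{prefix}{row.index(max(row)) + 1:02d}" for row in doc_topic]


docs = [tokenize(m) for m in MESSAGES]

# Compare candidate topic counts by perplexity (lower is better).
for k in (2, 3, 4):
    print(k, round(perplexity(docs, fit_lda(docs, k)), 2))

labels = label_documents(fit_lda(docs, 3))
for message, label in zip(MESSAGES, labels):
    print(label, message)
```

In practice the meaning attached to each code would still come from a person inspecting the top keywords of each topic, as the abstract describes. A library implementation would normally replace this hand-written sampler on real data volumes.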

