The rapid development of welding fabrication demands guidance data to support expert decision-making, retrospective process analysis, and production pattern upgrades, yet such data are lacking. This paper developed a guidance corpus for welding manufacturing. Unstructured documents that guide actual production were collected, and the text was segmented into words using a conditional random field model. We then organized an annotation team and developed a tool to annotate the data with the Begin-Inside-Outside (BIO) scheme. To address the challenge of corpus scalability, machine learning and deep learning models were trained for the named entity recognition task under supervised learning. The corpus contained 19,410 labeled pairs of samples and covered six entity categories: “standard,” “technology,” “design,” “department,” “manufacturing,” and “quality.” Baseline Precision, Recall, and F1-score results for the models were reported to provide a reference for welding production research. We also gave examples of task assignment and knowledge retrieval drawn from real production problems. The long-term goal is to continually enhance the corpus with contributions from more researchers, since a robust corpus would support further research and engineering applications in the welding domain.
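As a rough illustration of the Begin-Inside-Outside annotation mentioned above, the following minimal sketch converts entity spans over a segmented sentence into per-token tags. The function name `spans_to_bio`, the uppercase tag names, and the example sentence are hypothetical and are not taken from the corpus or the annotation tool; only the six category names come from the text.

```python
from typing import List, Tuple

# Entity categories named in the abstract; the uppercase spelling is an assumption.
LABELS = {"STANDARD", "TECHNOLOGY", "DESIGN", "DEPARTMENT", "MANUFACTURING", "QUALITY"}


def spans_to_bio(n_tokens: int, spans: List[Tuple[int, int, str]]) -> List[str]:
    """Turn (start, end_exclusive, label) spans into one BIO tag per token."""
    tags = ["O"] * n_tokens  # every token starts outside any entity
    for start, end, label in spans:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"span out of range: {(start, end)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end)}")
        tags[start] = f"B-{label}"  # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining tokens of the entity
    return tags


# Hypothetical example sentence, already segmented into words.
tokens = ["The", "quality", "department", "checks", "the", "weld", "seam"]
spans = [(1, 3, "DEPARTMENT"), (5, 7, "MANUFACTURING")]
tags = spans_to_bio(len(tokens), spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

Each token paired with its tag corresponds to one labeled sample of the kind a supervised named entity recognition model would be trained on.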