Abstract

With the increasing number of data-driven models in nuclear applications, large volumes of numerical data are required to accurately model and predict the health status of a plant component. However, many historical operation logs that contain useful information are not fully utilized because there is no systematic approach to digitizing them. To overcome this issue, this study proposes an automatic pipeline for extracting information from handwritten tabular documents collected from nuclear power plants. In our pipeline, we first denoise the scanned documents with morphological operations, and then extract the relevant parts of individual pages using both traditional computer vision and neural network methods. Handwriting recognition is then applied to obtain text and numbers. Because the most challenging step is cropping only the relevant information, the main focus of our paper is detecting tables and cells in scanned handwritten documents. We evaluate the efficiency and accuracy of our proposed method on handwritten operational reports obtained from a real-world case study. The results demonstrate the high accuracy and practicality of our proposed method.
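To make the table and cell detection step concrete, the following is a minimal sketch of one common traditional computer vision approach: morphological opening with long, thin structuring elements, which keeps ruled lines and suppresses short strokes and speckle. It is an illustration under stated assumptions, not the authors' implementation. The function names (`opening`, `detect_cells`), the `min_len` parameter, the toy binary page, and the simplification that ruled lines span the full page are all hypothetical.

```python
# Illustrative sketch only: pure-Python binary morphology on a toy page.
# 1 = ink, 0 = background. All names and parameters here are assumptions.

def _morph(img, kh, kw, op):
    """Apply op (min = erosion, max = dilation) over a kh x kw window.

    kh and kw are assumed odd; pixels outside the image count as background.
    """
    h, w = len(img), len(img[0])
    ry, rx = kh // 2, kw // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
                for dy in range(-ry, ry + 1)
                for dx in range(-rx, rx + 1)
            ]
            out[y][x] = op(window)
    return out


def opening(img, kh, kw):
    """Morphological opening: erosion followed by dilation."""
    return _morph(_morph(img, kh, kw, min), kh, kw, max)


def detect_cells(img, min_len=5):
    """Return cell interiors as (top, left, bottom, right), inclusive.

    Simplifying assumption: every ruled line spans the full page.
    """
    horiz = opening(img, 1, min_len)  # keeps horizontal runs >= min_len
    vert = opening(img, min_len, 1)   # keeps vertical runs >= min_len
    rows = [y for y, r in enumerate(horiz) if all(r)]
    cols = [x for x in range(len(img[0])) if all(r[x] for r in vert)]
    return [
        (rows[i] + 1, cols[j] + 1, rows[i + 1] - 1, cols[j + 1] - 1)
        for i in range(len(rows) - 1)
        for j in range(len(cols) - 1)
    ]


# Toy 7 x 9 page: a 2 x 2 table with a speckle and a short handwritten stroke.
page = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 0, 0, 1],  # speckle at column 2
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 1, 0, 1],  # short stroke at columns 5-6
    [1, 0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]

cells = detect_cells(page)
print(cells)
```

Each returned tuple bounds the interior of one cell, which could then be cropped and passed to a handwriting recognizer. On real scans, line detection would need to tolerate skew, broken rules, and partial lines, which this sketch deliberately ignores.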
