Abstract

In the oil and gas (O&G) industry, unstructured data sources such as technical reports on hydrocarbon production, daily drilling, and well construction contain valuable information. This information, however, is conveyed through various formats such as tables, forms, text, and figures. Detecting these different entities in documents is essential for building a structured representation of the information within and for automated processing of documents at scale. Our work presents a document layout analysis workflow that detects and localizes these entities using a deep learning framework. The workflow comprises a transformer-based object-detection framework that identifies the spatial location of entities in a document page. The key elements of the object-detection pipeline are a residual network backbone for feature extraction and an encoder-decoder transformer, based on detection transformers (DETR), that predicts object bounding boxes and category labels. Object detection is formulated as a direct set prediction task using bipartite matching, which eliminates conventional operations such as anchor box generation and non-maximum suppression. The limited availability of public document layout data sets that incorporate the artifacts observed in historical O&G technical reports is often a major challenge. We attempt to address this challenge by using a novel training data augmentation methodology. Densely packed elements on a page can introduce uncertainty in the predictions, resulting in bounding boxes that cut through text content. We adopt a bounding box post-processing methodology that refines the bounding box coordinates to minimize such undercuts. The proposed document layout analysis pipeline was trained to detect entity types such as headings, text blocks, tables, forms, and images/charts in a document page.
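
To make the set prediction formulation more concrete, the following is a minimal sketch of bipartite matching between predicted and ground-truth layout entities. It is not the authors' implementation: the names `match_cost` and `bipartite_match`, the dictionary layout of predictions and targets, and the `box_weight` value are assumptions made for illustration. The cost is simplified to a class-probability term plus an L1 box distance, and the optimal assignment is found by brute force over permutations, which is only practical for a handful of boxes. A production pipeline would normally use a Hungarian-algorithm solver instead.

```python
from itertools import permutations


def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two (x0, y0, x1, y1) boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match_cost(pred, target, box_weight=5.0):
    """Simplified pairwise cost (assumption): low when the prediction assigns
    high probability to the target label and the boxes are close."""
    class_cost = -pred["probs"].get(target["label"], 0.0)
    return class_cost + box_weight * l1_distance(pred["box"], target["box"])


def bipartite_match(preds, targets):
    """Return {prediction index: target index} for the one-to-one assignment
    with the lowest total cost. Predictions left out of the result would be
    trained toward a 'no object' class."""
    if len(preds) < len(targets):
        raise ValueError("need at least as many predictions as targets")
    best_cost, best_assignment = float("inf"), ()
    # Brute force: try every way of giving each target a distinct prediction.
    for pred_indices in permutations(range(len(preds)), len(targets)):
        cost = sum(
            match_cost(preds[p], targets[t]) for t, p in enumerate(pred_indices)
        )
        if cost < best_cost:
            best_cost, best_assignment = cost, pred_indices
    return {p: t for t, p in enumerate(best_assignment)}


# Hypothetical page with three predictions and two annotated entities.
preds = [
    {"box": (0.1, 0.1, 0.9, 0.2), "probs": {"heading": 0.8, "table": 0.1}},
    {"box": (0.1, 0.5, 0.9, 0.9), "probs": {"table": 0.7, "text": 0.2}},
    {"box": (0.1, 0.25, 0.9, 0.45), "probs": {"text": 0.6}},
]
targets = [
    {"box": (0.1, 0.5, 0.9, 0.9), "label": "table"},
    {"box": (0.1, 0.1, 0.9, 0.2), "label": "heading"},
]
matches = bipartite_match(preds, targets)
```

Because each ground-truth entity is matched to exactly one prediction, duplicate detections are penalized during training, which is why no anchor boxes or suppression step are needed at inference time.
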
A wide range of pages from lithology, stratigraphy, drilling, and field development reports was used for model training. These reports included a considerable number of historical scanned documents. The trained object-detection model was evaluated on a test data set prepared from the O&G reports. DETR outperformed Mask R-CNN on our data set.
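
The abstract does not describe how the bounding box post-processing works, so the sketch below shows only one plausible way to reduce undercuts, under stated assumptions. It assumes that word-level boxes are available (for example from an OCR step, which the source does not mention) and grows a predicted box so that any word it mostly covers is fully enclosed. The function name `refine_box` and the `min_overlap` threshold are hypothetical.

```python
def area(box):
    """Area of an (x0, y0, x1, y1) box; zero for degenerate boxes."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection_area(a, b):
    """Area of the overlap between two (x0, y0, x1, y1) boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def refine_box(pred_box, word_boxes, min_overlap=0.5):
    """Expand pred_box to fully enclose every word that it already covers by
    at least min_overlap of the word's area (assumed rule, for illustration).
    Words that are mostly outside the box are treated as belonging elsewhere."""
    x0, y0, x1, y1 = pred_box
    for word in word_boxes:
        word_area = area(word)
        if word_area == 0:
            continue  # skip degenerate word boxes
        # Overlap is measured against the original prediction, so the result
        # does not depend on the order of word_boxes.
        if intersection_area(pred_box, word) / word_area >= min_overlap:
            x0, y0 = min(x0, word[0]), min(y0, word[1])
            x1, y1 = max(x1, word[2]), max(y1, word[3])
    return (x0, y0, x1, y1)


# Hypothetical pixel coordinates: the first and third words are cut by the
# predicted box and pull its left and bottom edges outward; the second word
# is mostly outside and is ignored.
refined = refine_box(
    (10, 10, 100, 50),
    [(5, 12, 40, 22), (90, 40, 130, 55), (50, 42, 80, 54)],
)
# refined == (5, 10, 100, 54)
```

The threshold trades off two failure modes: a low value lets boxes absorb words from neighboring entities on dense pages, while a high value leaves more words cut by the box edge.
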
