Abstract

Context: Despite the effort put into developing standards for structuring construction costs and the strong interest in the field, most construction companies still perform the process of data gathering and processing manually. This provokes inconsistencies, different criteria when classifying, misclassifications, and the process becomes very time-consuming, particularly in large projects. Additionally, the lack of standardization makes cost estimation and comparison tasks very difficult. Objective: The aim of this work was to create a method to extract and organize construction cost and quantity data into a consistent format and structure to enable rapid and reliable digital comparison of the content. Methods: The approach consisted of a two-step method: firstly, the system implemented data mining to review the input document and determine how it was structured based on the position, format, sequence, and content of descriptive and quantitative data. Secondly, the extracted data were processed and classified with a combination of data science and experts’ knowledge to fit a common format. Results: A large variety of information coming from real historical projects was successfully extracted and processed into a common format with 97.5% accuracy using a subset of 5770 assets located on 18 different files, building a solid base for analysis and comparison. Conclusions: A robust and accurate method was developed for extracting hierarchical project cost data to a common machine-readable format to enable rapid and reliable comparison and benchmarking.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.