Setting up workable budgets reflects the competence of state highway agencies (SHAs) in fulfilling their responsibilities, and unreliable cost estimates can cause economic and political complications. Unclear scope definition and the scarcity of project information at early stages make it hard to generate reliable preliminary estimates. Using 1,249 projects retrieved from the Florida Department of Transportation (FDOT) database, this research aimed to develop a cost estimation model based on statistical learning methods that SHAs can use to forecast preliminary costs during the early stages of a transportation project and thereby support different cost control and managerial functions. Because the methods currently in use have serious limitations, this study introduced two alternative statistical learning approaches: the least absolute shrinkage and selection operator (LASSO) and the general regression neural network (GRNN). LASSO regression, for instance, has shown marked advantages in other areas of science in terms of variable selection, interpretability, and numerical stability. This study also incorporated economic indicators into model development, because economic conditions strongly influence highway construction costs but have received little attention in previous studies. LASSO and GRNN models were developed on the same dataset, and their performance was evaluated against a set of criteria, e.g., the mean absolute error and the mean absolute percentage error. In comparison to current practice at state DOTs, this research contributes to the body of knowledge by introducing a series of objective modeling approaches that can prevent human errors and require no substantial experience in preliminary estimating.
In addition, these statistical learning methods can produce reliable estimates faster and more consistently, which is critical given the massive workload faced by most SHAs and the limited time allowed for making a preliminary estimate.
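
To make the two evaluation criteria named above concrete, the following is a minimal sketch of how the mean absolute error (MAE) and mean absolute percentage error (MAPE) are commonly computed. The function names and the sample cost figures are hypothetical and chosen only for illustration; they are not taken from the FDOT dataset or from the study's implementation.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference, in the same units as the costs."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute error relative to the actual cost, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical project costs in millions of dollars (illustrative only).
actual_costs = [2.0, 4.0, 10.0]
estimated_costs = [2.5, 3.0, 10.0]

mae = mean_absolute_error(actual_costs, estimated_costs)
mape = mean_absolute_percentage_error(actual_costs, estimated_costs)
print(f"MAE  = {mae:.3f} million dollars")
print(f"MAPE = {mape:.2f} %")
```

MAE reports the typical error in cost units, so large projects dominate it, whereas MAPE scales each error by the actual cost, which makes projects of very different sizes comparable.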