Abstract

Unobserved heterogeneity is a common and challenging data and modeling issue in crash analysis, and it is often handled using random parameters and special distributions such as the Lindley distribution. Random parameters can be estimated for each observation across the entire dataset, grouped across segments of the dataset, or specified with varying means or varying variances. Selecting the best approach to handle unobserved heterogeneity depends on the data characteristics and requires corresponding hypothesis testing. Beyond unobserved heterogeneity, crash frequency modeling often requires explicit consideration of functional forms, transformations, and the identification of likely contributing factors. During model estimation, it is important to consider multiple objectives, such as in-sample and out-of-sample goodness of fit, to generate reliable and transferable insights. Taking all of these aspects and objectives into account simultaneously involves a very large number of modeling decisions and hypothesis tests. Limited testing and model development may introduce bias and overlook relevant specifications that carry important insights. To address these challenges, this paper proposes a comprehensive optimization framework, underpinned by a mathematical programming formulation, for systematic hypothesis testing that simultaneously considers multiple objectives, unobserved heterogeneity, grouped random parameters, functional forms, transformations, heterogeneity in means, and the identification of likely contributing factors. The proposed framework employs a variety of metaheuristic solution algorithms to address the complexity and non-convexity of the estimation and optimization problem. Several metaheuristics were tested, including Simulated Annealing, Differential Evolution, and Harmony Search. Harmony Search provided convergence with low sensitivity to the choice of hyperparameters.
The framework was evaluated on three real-world datasets and generated sound results that were consistent with the corresponding published models. These results demonstrate the ability of the proposed framework to efficiently estimate sound and parsimonious crash count models while reducing the time and expertise required, as well as the bias and sub-optimal solutions that result from limited testing. To support experimental testing by analysts and modelers, the Python package “MetaCountRegressor,” which includes the algorithms and software, is available on PyPI.
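
As a rough illustration of how a metaheuristic such as Harmony Search can explore candidate model specifications, the sketch below searches over binary include/exclude decisions for a set of candidate variables. It is not the MetaCountRegressor implementation. The function names, the parameter names (`hmcr`, `par`, `memory_size`), and the toy objective `toy_score` are assumptions made for this example; in practice the objective would be a fit criterion obtained by estimating a count model for each candidate specification.

```python
import random


def harmony_search(objective, n_vars, memory_size=10, hmcr=0.9, par=0.1,
                   iterations=500, seed=0):
    """Minimize `objective` over binary vectors of length `n_vars`.

    hmcr: probability of drawing a component from harmony memory.
    par:  probability of "pitch adjusting" (here, flipping) that component.
    """
    rng = random.Random(seed)

    # Harmony memory: random initial solutions stored with their scores.
    memory = []
    for _ in range(memory_size):
        harmony = [rng.randint(0, 1) for _ in range(n_vars)]
        memory.append((objective(harmony), harmony))

    for _ in range(iterations):
        new = []
        for j in range(n_vars):
            if rng.random() < hmcr:
                # Memory consideration: reuse component j of a stored harmony.
                bit = rng.choice(memory)[1][j]
                if rng.random() < par:
                    # Pitch adjustment for a binary decision: flip it.
                    bit = 1 - bit
            else:
                # Random consideration: draw the component afresh.
                bit = rng.randint(0, 1)
            new.append(bit)

        # Replace the worst stored harmony if the new one scores better.
        score = objective(new)
        worst = max(range(memory_size), key=lambda i: memory[i][0])
        if score < memory[worst][0]:
            memory[worst] = (score, new)

    return min(memory, key=lambda entry: entry[0])


# Hypothetical stand-in for a model-selection criterion: omitting a
# "relevant" variable costs 5, including an irrelevant one costs 1.
RELEVANT = {0, 3, 5}


def toy_score(include):
    missing = sum(5 for j in RELEVANT if include[j] == 0)
    extra = sum(1 for j, bit in enumerate(include)
                if bit == 1 and j not in RELEVANT)
    return missing + extra


best_score, best = harmony_search(toy_score, n_vars=8)
print(best_score, best)
```

The three branches inside the inner loop (memory consideration, pitch adjustment, random consideration) are the core of the method. A framework of the kind described above would extend the decision vector beyond inclusion flags, for example to choices of transformation or of fixed versus random parameters.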
