Abstract

The primary objective of this study was to investigate how trip pattern variables extracted from large-scale taxi GPS data contribute to the spatially aggregated crashes in urban areas. The following five types of data were collected: crash data, large-scale taxi GPS data, road network attributes, land use features and social-demographic data. A data-driven modeling approach based on Latent Dirichlet Allocation (LDA) was proposed for discovering hidden trip patterns from a taxi GPS dataset, and a total of fifty trip patterns were identified. The collected data and the identified trip patterns were further aggregated into167 ZIP Code Tabulation Areas (ZCTA). Random forest technique was used to identify the factors that contributed to total, PDO and fatal-plus-injury crashes in the selected ZCTAs during the study period. Geographically weighted Poisson regression (GWPR) models were then developed to establish a relationship between the crashes and the contributing factors selected by the random forest technique. Comparative analyses were conducted to compare the performance of the GWPR models that considered traditional traffic exposure variables only, trip pattern variables only, and both traditional exposure and trip pattern variables. The model specification results suggest that the trip pattern variables significantly affected the crash counts in the selected ZCTAs, and the models that considered both the traditional traffic exposure and the trip pattern variables had the best goodness-of-fit in terms of the lowest MAD and AICc values.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call