Abstract

Understanding individual travel behavior is vital in travel demand management as well as in urban and transportation planning. New data sources, including mobile phone data and location-based social media (LBSM) data, allow us to understand mobility behavior at an unprecedented level of detail. Recent studies of trip purpose prediction tend to use machine learning (ML) methods, since they generally produce high levels of predictive accuracy. Few studies have used LBSM as a large data source to predict individual travel destinations using ML techniques. In this research, we created a spatio-temporal probabilistic model based on an ensemble ML framework named “Random Forests”, using the trips extracted from geotagged Tweets in 419 census tracts of the Greater Cincinnati area, to predict the tract ID of an individual’s travel destination at any time from information about the trip origin. We evaluated the model accuracy using the trips extracted from the Tweets themselves as well as the trips from a household travel survey. Tweet-based and survey-based trips that start from the same tract in the southwestern parts of the study area are more likely to select the same destination than those in other parts. Also, both Tweet-based and survey-based trips were affected by the attraction points in downtown Cincinnati and the tracts in the northeastern part of the area. Finally, both evaluations show that the model predictions are acceptable, but the model cannot predict destinations from other data sources as precisely as it does from the Tweet-based data.

Highlights

  • With the evolution of urban travel demand models from aggregate to disaggregate models (Rasouli and Timmermans, 2014), there is a growing need to manage disaggregate travel data with spatial and temporal components in a GIS environment.

  • Understanding travel behavior is vital in travel demand management as well as in urban and transportation planning (Yue et al., 2014; Beiró et al., 2016).

  • More recent studies tend to use machine learning (ML) methods, since they generally produce higher levels of predictive accuracy than probabilistic and rule-based methods (Ermagun et al., 2017). Deng and Ji (2010) present an ML approach to deriving trip purpose from Global Positioning System (GPS) track data coupled with other relevant data sources.


Summary

INTRODUCTION

With the evolution of urban travel demand models from aggregate to disaggregate models (Rasouli and Timmermans, 2014), there is a growing need to manage disaggregate travel data with spatial and temporal components in a GIS environment. In addition to geotagged Tweets, the Household Travel Survey data provided by the Ohio-Kentucky-Indiana Regional Council of Governments (OKI) were used to explore the level of accuracy of the LBSM-based model in predicting the destinations of trips extracted from other data sources. Between August 2009 and August 2010, the OKI Regional Council of Governments collected detailed travel data from 1137 households whose members carried a Global Positioning System (GPS) handset when taking a trip. This survey recorded the trip information for each individual of every sampled household during weekdays, including trip purpose, origin location, destination location, means of transportation, trip count, travel time, and travel distance.
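The prediction task described here (a destination tract ID predicted from the trip origin and time of travel) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `RandomForestClassifier`, uses synthetic trips in place of the Tweet and survey data, and the feature names (`origin`, `hour`, `weekday`) and the toy destination rule are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for extracted trips (assumption: real inputs would be
# Tweet-derived trips across the 419 census tracts; 20 tracts keep this small).
n_trips, n_tracts = 2000, 20
origin = rng.integers(0, n_tracts, n_trips)   # origin tract index
hour = rng.integers(0, 24, n_trips)           # hour of day the trip starts
weekday = rng.integers(0, 7, n_trips)         # day of week

# Toy rule so there is a pattern to learn: daytime trips head to a
# "downtown" tract 0, other trips go to a neighbouring tract index.
dest = np.where((hour >= 7) & (hour <= 18), 0, (origin + 1) % n_tracts)

# Tract indices are treated as plain numeric features for simplicity.
X = np.column_stack([origin, hour, weekday])
train, test = slice(0, 1500), slice(1500, None)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X[train], dest[train])

# Per-trip probability of each destination tract seen in training,
# then the most likely tract and the share of exact matches.
proba = model.predict_proba(X[test])
predicted = model.classes_[proba.argmax(axis=1)]
accuracy = float((predicted == dest[test]).mean())
print(f"held-out accuracy: {accuracy:.3f}")
```

Evaluating against a second source, as the study does with the household survey, would amount to building the same feature columns from the survey trips and scoring the model trained on Tweets against them.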

Random Forest model
Calculation of the Model Parameters
Model Evaluation
EMPIRICAL RESULTS