Abstract

Taxi data are an underused source of travel information. A handful of studies have addressed the processing of raw taxi GPS data to minimize random error. Methods that systematically detect erroneous data have, however, received less attention. Generally, an origin and a destination are identified when the taxi occupancy status (occupied/vacant) changes. Trip information is recorded incorrectly if drivers operate their device incorrectly or if signals are missed when the status of the taxi changes. Such errors produce trips that appear extremely short or extremely long. This study proposes a set of criteria to evaluate the accuracy of trips imputed from taxi GPS data. In particular, the suggested criteria are inaccurate signal, mismatch between movement and speed, abnormal average speed, and mismatch between trip length measured on maps and trip length calculated from records. Taxi data should pass these tests if trips have been identified accurately. Using the suggested criteria, the accuracy of 150 million GPS records collected in Guangzhou, China, is evaluated.
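As an illustration of the approach described above, the following minimal Python sketch splits one taxi's time-ordered GPS records into trips at occupancy changes and flags trips that are implausibly short or have an abnormal average speed. It is not the study's implementation: the `Record` fields, the function names, and the thresholds (`min_duration_s`, `max_speed_kmh`) are hypothetical choices made for this example, and trip length is approximated by summing great-circle distances between consecutive records.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class Record:
    t: float        # timestamp in seconds (assumed field)
    lat: float      # latitude in degrees
    lon: float      # longitude in degrees
    occupied: bool  # occupancy status reported by the device


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * r * asin(sqrt(a))


def extract_trips(records):
    """Group consecutive occupied records into trips.

    A trip starts when the status becomes occupied and ends when it
    returns to vacant. Records must be time-ordered for a single taxi.
    """
    trips, current = [], []
    for rec in records:
        if rec.occupied:
            current.append(rec)
        elif current:
            trips.append(current)
            current = []
    if current:
        trips.append(current)
    return trips


def check_trip(trip, min_duration_s=60.0, max_speed_kmh=120.0):
    """Return a list of flags for one trip; thresholds are illustrative only."""
    duration_s = trip[-1].t - trip[0].t
    length_km = sum(
        haversine_km(a.lat, a.lon, b.lat, b.lon) for a, b in zip(trip, trip[1:])
    )
    flags = []
    if duration_s < min_duration_s:
        flags.append("too_short")
    avg_kmh = length_km / (duration_s / 3600.0) if duration_s > 0 else 0.0
    if avg_kmh > max_speed_kmh:
        flags.append("abnormal_average_speed")
    return flags
```

A trip with an empty flag list passes these two checks. The remaining criteria named in the abstract (inaccurate signal, mismatch between movement and speed, and comparison with map-measured length) would need additional inputs such as instantaneous speed and a road network, which this sketch does not assume.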
