Abstract

Many traffic-related applications, e.g. traffic demand modeling, rely on conventional data collection methods such as travel surveys. These methods are costly and time-consuming, which results in low network coverage and limited representativeness. On-motion sensors, e.g. smartphones, offer an opportunity to replace such methods and compensate for these drawbacks by collecting positioning data automatically. GPS data consist of positioning records, each of which comprises geographical coordinates and an associated timestamp. These data, however, require cleansing and processing before they can be used. In particular, the transportation mode used is missing from this kind of data unless travelers are specifically asked to report it. In the literature, supervised machine learning (ML) algorithms have been successful in inferring transportation modes from GPS data. However, these algorithms, unlike unsupervised ML algorithms, require labeled training data, which are not always available. This paper investigates the capability of unsupervised ML algorithms to infer transportation modes from real GPS data extracted from smartphones. To this end, we used two datasets to benchmark different unsupervised ML algorithms with different input attributes. The paper also investigates the feasibility of applying a pre-trained model to unlabeled real data. Finally, we compared the best-performing unsupervised setup with the supervised ML algorithms recommended in the literature. The results suggest that the recommended unsupervised setups can reach an overall inference accuracy of 93%.
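
To make the idea of unsupervised mode inference concrete, the following is a minimal sketch, not the setup evaluated in the paper. It assumes each trip segment is a time-ordered list of (latitude, longitude, ISO timestamp) records, derives a single attribute (mean speed) per segment, and groups segments with a plain one-dimensional k-means. The abstract does not name the algorithms or input attributes that were benchmarked, so the choice of k-means, the mean-speed attribute, and the helper names `haversine_m`, `mean_speed` and `kmeans_1d` are all illustrative assumptions.

```python
import math
from datetime import datetime


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def mean_speed(records):
    """Mean speed in m/s of one segment.

    records: time-ordered list of (lat, lon, iso_timestamp) tuples
    (an assumed record layout, chosen for illustration).
    """
    dist = 0.0
    for (la1, lo1, _), (la2, lo2, _) in zip(records, records[1:]):
        dist += haversine_m(la1, lo1, la2, lo2)
    t0 = datetime.fromisoformat(records[0][2])
    t1 = datetime.fromisoformat(records[-1][2])
    elapsed = (t1 - t0).total_seconds()
    return dist / elapsed if elapsed > 0 else 0.0


def assign(values, centroids):
    """Index of the nearest centroid for every value."""
    return [min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values]


def kmeans_1d(values, k, iters=50):
    """Plain 1-D k-means with deterministic, evenly spread initial centroids."""
    s = sorted(values)
    if k == 1:
        centroids = [sum(s) / len(s)]
    else:
        centroids = [s[round(i * (len(s) - 1) / (k - 1))] for i in range(k)]
    for _ in range(iters):
        labels = assign(values, centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(sum(members) / len(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return assign(values, centroids), centroids
```

Calling `kmeans_1d([mean_speed(seg) for seg in segments], k)` returns one cluster index per segment. The clusters carry no mode names by themselves; mapping a cluster to "walk" or "car" still needs some external knowledge, for example inspecting the cluster centroids.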
