Abstract

Because mobile phones are ubiquitous, mobile phone network data (e.g., Call Detail Records, CDR, and cellular signaling data, CSD), which mobile telecommunication operators collect for maintenance purposes, make it possible to study the travel behavior of a large share of the population with full temporal coverage and at comparatively low cost. However, extracting mobility information such as transport modes from these data is very challenging because of their low spatial accuracy and their infrequent and irregular temporal characteristics. Existing studies relying on mobile phone network data have mostly employed simple rule-based methods with geographic data and focused on easy-to-detect transport modes (e.g., train and subway) or coarse-grained modes (e.g., public versus private transport). Moreover, owing to the lack of ground truth data, evaluations of these methods were either not reported or reported only for aggregate data, so it remains unclear how well existing methods detect the modes of individual trips. This article proposes three methods to detect fine-grained transport modes from CSD, specifically subway, train, tram, bike, car, and walk: two supervised methods, one combining rule-based heuristics (RBH) with random forest (RF) and the other combining RBH with a fuzzy logic system, and a third, unsupervised method combining RBH with k-medoids clustering. Evaluation with a labeled ground truth dataset shows that the best-performing method is the hybrid of RBH and RF, which achieves a classification accuracy of 73% when differentiating these modes. To our knowledge, this is the first study that distinguishes fine-grained transport modes in CSD and validates the results with ground truth data. This study may thus inform future CSD-based applications in areas such as intelligent transport systems, urban and transport planning, and smart cities.
