Abstract

Recognition of transportation modes can be used in different applications including human behavior research, transport management and traffic control. Previous work on transportation mode recognition has often relied on multiple sensors or on matching Geographic Information System (GIS) information, which is not possible in many cases. In this paper, an approach based on ensemble learning is proposed to infer hybrid transportation modes using only Global Positioning System (GPS) data. First, to distinguish between different transportation modes, we used a statistical method to generate global features and extracted several local features from sub-trajectories after trajectory segmentation; these features were then combined in the classification stage. Second, to obtain better performance, we used tree-based ensemble models (Random Forest, Gradient Boosting Decision Tree, and XGBoost) instead of traditional methods (K-Nearest Neighbor, Decision Tree, and Support Vector Machines) to classify the different transportation modes. The experimental results have shown the efficacy of our proposed approach. Among the models, XGBoost produced the best performance, with a classification accuracy of 90.77% on the GEOLIFE dataset. We also used a tree-based ensemble method to ensure accurate feature selection and reduce model complexity.
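To make the feature step more concrete, the following Python sketch shows one way global and local speed features could be derived from raw GPS points. It is an illustration under stated assumptions, not the paper's implementation: the function names (`haversine_m`, `speeds`, `global_features`, `local_features`), the choice of speed statistics, and the fixed-size segmentation are all hypothetical stand-ins for the statistical method and trajectory segmentation mentioned above.

```python
import math
import statistics


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def speeds(points):
    """Speeds in m/s between consecutive (lat, lon, t_seconds) points."""
    out = []
    for (la1, lo1, t1), (la2, lo2, t2) in zip(points, points[1:]):
        dt = t2 - t1
        if dt > 0:  # skip duplicate or out-of-order timestamps
            out.append(haversine_m(la1, lo1, la2, lo2) / dt)
    return out


def global_features(points):
    """Whole-trajectory statistics (assumed feature set: mean, max, std of speed)."""
    v = speeds(points)
    if not v:
        return {"mean_speed": 0.0, "max_speed": 0.0, "std_speed": 0.0}
    return {
        "mean_speed": statistics.mean(v),
        "max_speed": max(v),
        "std_speed": statistics.pstdev(v),
    }


def local_features(points, segment_size=3):
    """Mean speed per sub-trajectory.

    Assumption: fixed-size segments that share an endpoint stand in for
    the paper's trajectory segmentation, which is not specified here.
    """
    if segment_size < 2:
        raise ValueError("segment_size must be at least 2")
    feats = []
    for i in range(0, len(points) - 1, segment_size - 1):
        v = speeds(points[i:i + segment_size])
        if v:
            feats.append(statistics.mean(v))
    return feats


# Example: five points moving east along the equator, one fix every 10 s.
track = [(0.0, 0.001 * k, 10 * k) for k in range(5)]
combined = dict(global_features(track))
combined["local_mean_speeds"] = local_features(track)
print(combined)
```

In a pipeline of the kind the abstract describes, a combined record like this would be turned into a fixed-length vector and passed to a classifier; how the local features are aggregated into that vector is left open here.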

Highlights

  • Understanding user mobility from trajectory data has received considerable attention among users and developers

  • We propose a method to infer hybrid transportation modes from Global Positioning System (GPS) data alone, achieving good performance on a real-life trajectory dataset collected by the Microsoft GeoLife project [14,15]

  • Ensemble methods are widely used in many machine learning challenges; for example, Random Forest was used in the 2010 Knowledge Discovery and Data Mining (KDD) Cup to win the first prize [20], the Gradient Boosting Decision Tree (GBDT) was used in the Netflix prize [18], and XGBoost was successfully used in Kaggle competitions and the 2015 KDD Cup [19]

ISPRS Int. J. Geo-Inf. 2017, 6, 57

Summary

Introduction

Understanding user mobility from trajectory data has received considerable attention among users and developers. For example, a system may recommend residential or transportation services if somebody is traveling on a train. Transportation mode recognition is also a significant part of transportation planning [4] and traffic management, where transportation mode selection greatly influences transportation planning. Almost all studies use traditional machine learning algorithms such as Decision Tree, K-Nearest Neighbor and Support Vector Machines [6,7,10,11,12,13,14,15,16,17]. These methods were found to be worse than ensemble methods in terms of accuracy and model robustness [18,19,20,21]. The related experiments and their discussion are presented before the conclusions of the study.

Related Work
Global Features
Local Features
Model Classification and Model Evaluation
Gradient Boosting Decision Tree
XGBoost
K-Fold Cross-Validation
Evaluations
Results and Discussion
K Nearest Neighbor
Decision Tree
Support Vector Machine
Random Forest
Model Comparison
Feature Importance and Feature Reduction
