Abstract

Since GPS receivers, and later GPS-enabled smartphones, were introduced into transportation data collection, many studies have examined how to infer meaningful information from the resulting data. Research in this field has concentrated on heuristics and supervised machine learning methods for detecting trip ends, trip itineraries, travel mode and trip purpose. All methods used so far rely exclusively on fully validated data. However, the respondent burden associated with validation lowers participation rates and results in less reliable data. In this paper, we propose the use of semi-supervised methods, which use both validated and unvalidated data. We compare the accuracy of two popular supervised methods (decision tree and random forest) with that of a simple semi-supervised method (label propagation with a KNN kernel). To detect the mode of transport, we use the speed, duration and length of each trip, as well as the proximity of its start and end points to the transit network. The results show that the semi-supervised method slightly outperforms the supervised methods when a high proportion of the data is unvalidated, while the run-time of the more efficient of the two supervised methods was on average almost 16 times longer than that of the semi-supervised method.
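
As an illustration only, the following is a minimal sketch of label propagation over a k-nearest-neighbour graph, applied to trips described by speed, duration, length and distance to transit. It is not the implementation evaluated in the paper. The trips are synthetic, and the per-mode values in `PROFILES`, the helper names, the symmetrised graph, `k=7` and the choice of 5 validated trips per mode are all assumptions made for the example.

```python
import math
import random

MODES = ["walk", "car", "transit"]
N_PER_MODE = 40    # synthetic trips per mode (assumption)
N_VALIDATED = 5    # trips per mode that keep their label (assumption)

# Hypothetical (mean, sd) per mode for speed in km/h, trip length in km,
# and distance from the trip ends to the nearest transit stop in km.
PROFILES = {
    "walk":    ((5.0, 0.5),  (1.0, 0.2),  (0.5, 0.1)),
    "car":     ((40.0, 4.0), (20.0, 3.0), (1.0, 0.15)),
    "transit": ((20.0, 2.0), (7.0, 1.0),  (0.1, 0.03)),
}


def make_trips(n_per_mode, rng):
    """Draw synthetic trips; returns feature rows and true mode indices."""
    X, y = [], []
    for label, mode in enumerate(MODES):
        (s_mu, s_sd), (l_mu, l_sd), (t_mu, t_sd) = PROFILES[mode]
        for _ in range(n_per_mode):
            speed = max(0.5, rng.gauss(s_mu, s_sd))
            length = max(0.1, rng.gauss(l_mu, l_sd))
            transit_dist = max(0.0, rng.gauss(t_mu, t_sd))
            duration = 60.0 * length / speed  # minutes
            X.append([speed, duration, length, transit_dist])
            y.append(label)
    return X, y


def standardize(X):
    """Z-score each column so that no single unit dominates the distances."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]


def knn_neighbours(X, k):
    """Symmetrised k-nearest-neighbour graph as a list of neighbour sets."""
    nbrs = [set() for _ in X]
    for i, xi in enumerate(X):
        ranked = sorted((math.dist(xi, xj), j)
                        for j, xj in enumerate(X) if j != i)
        for _, j in ranked[:k]:
            nbrs[i].add(j)
            nbrs[j].add(i)  # make the edge undirected
    return nbrs


def propagate_labels(X, y, n_classes, k=7, max_iter=200, tol=1e-6):
    """y[i] is a class index for validated trips and -1 for unvalidated ones."""
    nbrs = knn_neighbours(X, k)
    # One-hot rows for validated trips, all-zero rows for unvalidated trips.
    F = [[1.0 if c == label else 0.0 for c in range(n_classes)] for label in y]
    for _ in range(max_iter):
        new_F = []
        for i, label in enumerate(y):
            if label >= 0:
                new_F.append(F[i])  # clamp validated labels
            else:
                # Average the label distributions of the graph neighbours.
                new_F.append([sum(F[j][c] for j in nbrs[i]) / len(nbrs[i])
                              for c in range(n_classes)])
        change = max(abs(a - b)
                     for row_new, row_old in zip(new_F, F)
                     for a, b in zip(row_new, row_old))
        F = new_F
        if change < tol:
            break
    # Most probable class per trip (class 0 if a trip received no label mass).
    return [row.index(max(row)) for row in F]


rng = random.Random(0)
X_raw, y_true = make_trips(N_PER_MODE, rng)
X = standardize(X_raw)
# Hide the label of every trip except the first N_VALIDATED of each mode.
y_partial = [label if i % N_PER_MODE < N_VALIDATED else -1
             for i, label in enumerate(y_true)]
y_pred = propagate_labels(X, y_partial, n_classes=len(MODES))
unvalidated = [i for i, label in enumerate(y_partial) if label == -1]
accuracy = sum(y_pred[i] == y_true[i] for i in unvalidated) / len(unvalidated)
print(f"accuracy on unvalidated trips: {accuracy:.2f}")
```

Validated trips keep their labels throughout, while each unvalidated trip repeatedly takes the average label distribution of its graph neighbours. In practice a library implementation, such as scikit-learn's `LabelPropagation` with `kernel="knn"`, would normally be preferred over a hand-rolled loop like this one.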
