Abstract

App-based ride-hailing mobility services are becoming increasingly popular in cities worldwide. However, key drivers explaining the balance between supply and demand to set final prices remain to a considerable extent unknown. This research intends to understand and predict the behavior of ride-hailing fares by employing statistical and supervised machine learning approaches (such as Linear Regression, Decision Tree, and Random Forest). The data used for model calibration correspond to a ten-month period and were downloaded from the Uber Application Programming Interface for the city of Madrid. The findings reveal that the Random Forest model is the most appropriate for this type of prediction, having the best performance metrics. To further understand the patterns of the prediction errors, the unsupervised technique of cluster analysis (using the k-means clustering method) was applied to explore the variation of the discrepancy between Uber fares predictions and observed values. The analysis identified a small share of observations with high prediction errors (only 1.96%), which are caused by unexpected surges due to imbalances between supply and demand (usually occurring at major events, peak times, weekends, holidays, or when there is a taxi strike). This study helps policymakers understand pricing, demand for services, and pricing schemes in the ride-hailing market.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call