Using Different Machine Learning Algorithms to Predict the Prices of Flight Tickets

Jeremy Rohan Bollack, Joseph Anthony Vincent

doi:10.47611/jsrhs.v12i4.5303

Abstract

Rising flight ticket prices and the lack of transparency in airlines' dynamic pricing strategies have led many consumers to wonder what factors actually determine these prices. To investigate this question, a large dataset of flight ticket bookings containing the variables most likely to determine price was acquired. The data was preprocessed using discretization, normalization, and principal component analysis, and then used to train five different Machine Learning algorithms: Linear Regression, DecisionTree, Ridge Regression, RandomForest, and SVR. The RandomForest and SVR models could not be trained due to runtime errors; the other three models trained as expected. All three trained models performed well, with Linear Regression and Ridge Regression performing identically. Overall, the DecisionTree model was the best at predicting flight prices, and its performance could be increased further by adjusting hyperparameters. The investigation could be continued with a larger dataset to examine how the model performs with more variables and under broader conditions. Additionally, the model could be repurposed into a user-friendly flight price prediction tool that helps consumers with their purchasing decisions.
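The workflow summarized above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses synthetic data with three hypothetical features (days before departure, flight duration, number of stops), and applies the preprocessing steps in an assumed order to every feature. Only the three models reported as successfully trained are included.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer, MinMaxScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for a flight booking dataset (hypothetical features).
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([
    rng.integers(1, 50, n),   # days between booking and departure
    rng.uniform(1, 15, n),    # flight duration in hours
    rng.integers(0, 3, n),    # number of stops
])
# Made-up price formula with noise, used only so the models have a target.
y = 6000 - 60 * X[:, 0] + 300 * X[:, 1] + 800 * X[:, 2] + rng.normal(0, 200, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

def build(model):
    """Chain discretization, normalization, and PCA in front of a regressor."""
    return make_pipeline(
        KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform"),
        MinMaxScaler(),
        PCA(n_components=2),
        model,
    )

models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "DecisionTree": DecisionTreeRegressor(max_depth=5, random_state=0),
}

# Fit each pipeline on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    pipeline = build(model).fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipeline.predict(X_test))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

The bin count, number of principal components, `alpha`, and `max_depth` are arbitrary choices for the sketch; on real booking data these are the hyperparameters one would tune.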

Full Text