Abstract

ABSTRACTAs an important research topic, real-time crash likelihood prediction has been studied for many years. However, few research focuses on the missing data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to breakdown of sensors or external interference. Besides, classifying imbalanced data is also a critical issue in real-time crash likelihood prediction, since the number of crash-prone cases is much smaller than that of non-crash cases. In this paper, three principal component analysis (PCA) based approaches are established for imputing missing values, while two kinds of solutions are developed to tackle the issue of imbalanced data. The results show that the proposed methods can help the classifiers achieve better predictive performance under situations with missing data. The two solutions, i.e. cost-sensitive learning, and synthetic minority oversampling technique (SMOTE), can help improve the sensitivity by adjusting the classifiers to pay more attention to the minority class.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.