Abstract

In the current era of big data, huge volumes of valuable data can easily be generated and collected at high velocity from a wide variety of rich data sources. In recent years, open data initiatives have also led many governments, researchers, and organizations to share their data and make them publicly accessible. One example of open big data is transportation data, such as public bus performance data. Analyzing such open big data can serve the social good. For instance, by analyzing and mining public bus performance data, bus service providers can gain insight into the on-time performance of, or delays in, their bus services. Taking appropriate actions in response (e.g., adding more buses or rerouting some bus routes) could then enhance the rider experience. In this paper, we present a Bayesian framework for supporting predictive analytics over big transportation data. Specifically, our framework consists of several Bayesian networks that predict whether a bus will arrive later than its scheduled time at a given bus stop. We analyze and determine the network configuration and/or parameter permutation that produces the best result for each (bus stop, bus route, arrival time) triplet. Evaluation on open big data for public transit buses from a North American city shows the effectiveness and practicality of our Bayesian framework in supporting predictive analytics on big open data for transportation analytics.
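To make the per-triplet idea concrete, the following is a minimal sketch and not the framework evaluated in the paper. It assumes a very simple network structure (a naive-Bayes layout in which lateness is the parent of two observed variables) and fits one such model for each (bus stop, bus route, arrival time) triplet. The field names, the two predictor variables (`day_type`, `prev_stop_late`), the class `LatenessModel`, the helper `fit_per_triplet`, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical records (not real transit data):
# (stop, route, scheduled_hour, day_type, prev_stop_late, late)
records = [
    ("S1", "R10", 8, "weekday", True, True),
    ("S1", "R10", 8, "weekday", True, True),
    ("S1", "R10", 8, "weekday", False, False),
    ("S1", "R10", 8, "weekend", False, False),
    ("S2", "R10", 9, "weekday", False, False),
]


class LatenessModel:
    """Assumed structure: Late -> day_type, Late -> prev_stop_late."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing strength
        self.class_counts = Counter()               # late -> count
        self.feature_counts = defaultdict(Counter)  # (index, late) -> value counts
        self.feature_values = defaultdict(set)      # index -> values seen in training

    def fit(self, rows):
        for features, late in rows:
            self.class_counts[late] += 1
            for i, value in enumerate(features):
                self.feature_counts[(i, late)][value] += 1
                self.feature_values[i].add(value)
        return self

    def prob_late(self, features):
        """Posterior P(late | features) under the assumed structure."""
        total = sum(self.class_counts.values())
        scores = {}
        for late in (True, False):
            # Smoothed prior over the two lateness outcomes.
            p = (self.class_counts[late] + self.alpha) / (total + 2 * self.alpha)
            for i, value in enumerate(features):
                k = max(len(self.feature_values[i]), 1)
                p *= (self.feature_counts[(i, late)][value] + self.alpha) / (
                    self.class_counts[late] + k * self.alpha
                )
            scores[late] = p
        return scores[True] / (scores[True] + scores[False])


def fit_per_triplet(rows):
    """Fit one model per (stop, route, scheduled_hour) triplet."""
    grouped = defaultdict(list)
    for stop, route, hour, day_type, prev_late, late in rows:
        grouped[(stop, route, hour)].append(((day_type, prev_late), late))
    return {key: LatenessModel().fit(group) for key, group in grouped.items()}


models = fit_per_triplet(records)
p_late = models[("S1", "R10", 8)].prob_late(("weekday", True))
print(f"P(late) = {p_late:.3f}")
```

Selecting the best configuration per triplet, as the abstract describes, could then amount to fitting several candidate structures or smoothing values for each key and keeping the one with the best held-out accuracy; that selection step is left out of this sketch.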

