Abstract
Non-recurring incidents often disrupt traffic and give rise to non-recurrent congestion. Travellers can usually perceive the congestion but seldom know its underlying cause. Accordingly, this paper proposes a data-driven framework for detecting and interpreting non-recurrent congestion. First, a statistical algorithm named generalized extreme studentized deviate is introduced to detect non-recurrent congestion by comparing the current traffic speed with a speed threshold learned from historical data. A case study in Beijing shows that the proposed generalized extreme studentized deviate algorithm outperforms other prevailing algorithms in terms of detection rate, false alarm rate, and mean detection time. Second, data mining and natural language processing techniques are applied to data collected from Sina Weibo, a Chinese microblog site akin to Twitter, to classify non-recurring incidents that may be associated with non-recurrent congestion, including traffic accidents, road construction, concerts, special sporting events (marathon), and commercial activities. Results show that the overall classification accuracy reaches 95%. Finally, the association between detected non-recurrent congestion and incidents is established via spatiotemporal information matching, which provides bidirectional verification. On the one hand, nearly 58% of non-recurrent congestion can be explained by incident-related (IR) microblogs. On the other hand, an average of 62% of IR microblogs can be matched to nearby non-recurrent congestion. This paper suggests that social media can serve as a secondary data source and be integrated with traffic data to enhance the understanding of non-recurrent congestion.
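The spatiotemporal matching step and its bidirectional rates can be illustrated with a minimal Python sketch. This is not the paper's implementation: the record types `Congestion` and `Microblog`, the helper names, the 1 km radius, and the 30-minute time slack are all hypothetical choices made for illustration, since the abstract does not state the matching thresholds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Congestion:  # hypothetical record for one detected congestion event
    lat: float
    lon: float
    start: datetime
    end: datetime


@dataclass
class Microblog:  # hypothetical record for one incident-related (IR) post
    lat: float
    lon: float
    posted: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def is_match(c, m, radius_km=1.0, slack=timedelta(minutes=30)):
    """True if the post is close to the congestion in both space and time.

    radius_km and slack are assumed thresholds, not values from the paper.
    """
    near = haversine_km(c.lat, c.lon, m.lat, m.lon) <= radius_km
    in_window = c.start - slack <= m.posted <= c.end + slack
    return near and in_window


def match_rates(congestions, microblogs, **thresholds):
    """Return (share of congestion explained by a post,
               share of posts matched to some congestion)."""
    if not congestions or not microblogs:
        return 0.0, 0.0
    explained = sum(
        any(is_match(c, m, **thresholds) for m in microblogs)
        for c in congestions
    )
    traced = sum(
        any(is_match(c, m, **thresholds) for c in congestions)
        for m in microblogs
    )
    return explained / len(congestions), traced / len(microblogs)
```

The two returned values correspond to the two directions of verification described above: the first counts congestion events that have at least one nearby IR microblog, and the second counts IR microblogs that have at least one nearby congestion event.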