Abstract

Extracting meaningful information from noisy high-dimensional data is attracting increasing attention as richer and higher-resolution data is collected and used for transportation system planning and management. Discovering critical information via effective data representation learning not only reduces data dimensionality but also enables a deeper understanding of the underlying properties of noisy data, which can in turn lead to better planning and operations decisions. In this study, we argue, in contrast to most existing approaches in the general data science literature, that the design of a data representation should go beyond the data itself and incorporate an understanding of how the data is used in domain-specific applications. We further argue that this design philosophy is particularly important for transportation data because of the strong spatial correlations induced by network interdependence. We propose a usage-aware representation learning framework that incorporates the information loss in the downstream application into the data encoding-decoding process. The proposed approach is formulated as a Stiefel manifold optimization problem. The effectiveness of the framework is demonstrated in two network applications: modeling transportation network flows and estimating network-level vehicular emissions. The learned representation is compared with existing approaches in multiple evaluation contexts, including data reconstruction quality, clustering, anomaly detection, and critical information identification, through case studies on the Sioux Falls, Boston, and San Jose networks. The consistently strong performance of our approach across these experiments indicates the importance of incorporating downstream data usage into the process of data representation learning.
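To make the idea of a usage-aware objective on the Stiefel manifold concrete, the following is a minimal sketch and not the formulation used in the study. It assumes a linear encoder-decoder with an orthonormal basis U (so the reconstruction is U Uᵀ x), a hypothetical linear downstream map A standing in for the application (for example, a matrix that aggregates link-level quantities), and a hypothetical weight lam on the downstream loss. The optimizer is plain Riemannian gradient descent with a QR retraction and backtracking, chosen here only for simplicity; the function names are illustrative.

```python
import numpy as np

def usage_aware_loss(U, X, A, lam):
    """Reconstruction loss plus lam times the loss seen by the downstream map A."""
    R = X - U @ (U.T @ X)                  # residual after encode-decode
    return np.sum(R ** 2) + lam * np.sum((A @ R) ** 2)

def euclidean_grad(U, X, A, lam):
    """Gradient of usage_aware_loss with respect to U, ignoring the constraint."""
    R = X - U @ (U.T @ X)
    MR = R + lam * (A.T @ (A @ R))         # (I + lam * A^T A) @ R
    return -2.0 * (MR @ (X.T @ U) + X @ (MR.T @ U))

def fit_usage_aware_basis(X, A, k, lam=1.0, iters=200, seed=0):
    """Riemannian gradient descent over matrices U with U^T U = I (assumed setup)."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((X.shape[0], k)))  # random orthonormal start
    loss = usage_aware_loss(U, X, A, lam)
    history = [loss]
    for _ in range(iters):
        G = euclidean_grad(U, X, A, lam)
        UtG = U.T @ G
        rgrad = G - U @ (UtG + UtG.T) / 2  # project onto the tangent space at U
        step = 1.0
        while step > 1e-12:                # backtrack until the loss decreases
            Q, _ = np.linalg.qr(U - step * rgrad)  # QR retraction back to the manifold
            new_loss = usage_aware_loss(Q, X, A, lam)
            if new_loss < loss:
                U, loss = Q, new_loss
                break
            step *= 0.5
        history.append(loss)
    return U, history
```

With lam set to 0 the objective reduces to ordinary reconstruction error, the same objective that PCA without mean-centering minimizes over k-dimensional subspaces; increasing lam pushes the learned subspace toward directions that matter for the assumed downstream map.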
