Abstract

Data quality is the foundation of data-driven applications in transportation. Data problems such as missing and invalid records can sharply reduce the performance of the methods used in these applications. Although many studies address data quality issues, they focus only on missing or invalid data caused by infrastructure failures (e.g., loop detector malfunction). In general, data quality issues that stem from insufficient data management have received little attention. This paper proposes a tensor-decomposition-based framework to tackle a specific missing data problem that occurs when the machine-station dictionary of an automated fare collection system database is incomplete. In such cases, the affected fare machines are not linked to any station, so all transactions associated with them may lack origin/destination information, and a large amount of that information is lost. The proposed framework recovers the dictionary by capturing features of the passenger flow passing through each unlinked fare machine. Evaluation results show that the proposed approach can recover the missing data with high accuracy even when several fare machines are not linked to a station. The framework could also support other beneficial applications.
