Abstract

The ACM DEBS Grand Challenge 2015 focuses on real-time analytics over a high-volume geospatial data stream composed of taxi trip reports from New York City. The goal of the challenge is to provide a solution that continuously identifies the most frequent routes (query 1) and the most profitable areas (query 2) for taxis in New York City. The solution needs to process the incoming data stream in near real-time to provide valid information about taxi positions to end users in a real-world deployment. We propose a modular processing engine design that is configured for high data throughput and low processing latency. It consists of three main components: an input processor, which pre-processes data objects to detect outliers, and two independent query processors, each tailored to the requirements of one challenge query. To compute query results efficiently, the query processors use algorithms customized to the distribution of the taxi-generated data stream. Our experimental evaluation shows that the system processes on average 350,000 input events per second in distributed mode, while achieving an average latency of less than 1 ms for both queries. Given this performance, the proposed algorithms are well suited to tracking the large numbers of vehicles present in modern urban areas.
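To make query 1 concrete, the following is a minimal sketch of counting routes over a time-based sliding window and reporting the most frequent ones. It is not the authors' algorithm, which the abstract does not describe. The class name `SlidingRouteCounter`, the representation of a route as a `(start_cell, end_cell)` pair, the window length, and the assumption that events arrive in timestamp order are all illustrative choices.

```python
from collections import Counter, deque


class SlidingRouteCounter:
    """Hypothetical helper: counts routes seen within a sliding time window.

    Assumes events arrive in non-decreasing timestamp order.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, route) pairs, oldest first
        self.counts = Counter()  # route -> occurrences inside the window

    def add(self, timestamp, route):
        """Register one trip and drop trips that have left the window."""
        self.events.append((timestamp, route))
        self.counts[route] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Remove every event that is at least window_seconds older than `now`.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_route = self.events.popleft()
            self.counts[old_route] -= 1
            if self.counts[old_route] == 0:
                del self.counts[old_route]

    def top(self, k):
        """Return the k most frequent routes currently in the window."""
        return self.counts.most_common(k)
```

A caller would feed each cleaned trip report to `add` and call `top(k)` whenever an updated ranking is needed. Recomputing the ranking from the full counter on every event is simple but not tuned for throughput; a production engine would likely maintain the ranking incrementally.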
