Abstract
Recently, we have observed a significant increase in the amount of easily accessible data on transport and mobility. This data mostly takes the form of massive streams of high velocity, magnitude, and heterogeneity, which represent the flow of goods and shipments and the movements of fleets. It is therefore necessary to develop a scalable framework and apply tools capable of handling these streams. In this paper, we propose an approach for the selection of software for stream processing solutions that may be used in the transportation domain. We provide an overview of potential stream processing technologies, followed by a method for choosing software for the real-time analysis of data streams coming from objects in motion. We selected two solutions, Apache Spark Streaming and Apache Flink, and benchmarked them on a real-world task. We also identified the caveats and challenges of implementing the solution in practice.
Highlights
The recent rapid development of communication and detection technologies, the emergence of low-cost and widespread smart sensors, and a significant drop in data storage costs have all contributed to a significant increase in the amount of accessible data on transport and mobility
In order to design such a scalable and efficient architecture, a number of choices have to be made by designers, including the selection of proper software and hardware. In response to this challenge, we propose an approach for the selection of software for stream processing solutions that may be used in the transportation domain
We provide an overview and characterization of some stream processing technologies and propose a method for comparing the selected software for real-time analysis of data streams coming from objects in motion. We also identify the caveats and challenges of implementing the software in practice
Summary
The recent rapid development of communication and detection technologies, the emergence of low-cost and widespread smart sensors, and a significant drop in data storage costs have all contributed to a significant increase in the amount of accessible data on transport and mobility. The transport sector manages a massive flow of goods and at the same time creates large data sets. These data streams concern, inter alia, millions of shipments and the movements of fleets that are tracked every day [2]. Data extracted from computers embedded in vehicles that concern an object in motion, i.e., its origin, destination, content, and location, can be used to better understand and predict the flow of goods and people in real time. These data streams can be further combined with other data gathered from navigational systems, mobile phones (e.g., location, activities), environmental sensors (e.g., pollution levels), or social networks (e.g., people’s preferences and relationships). Batch processing is intended for large quantities of data that are usually not time-sensitive.